Application controls for speech enabled recognition

ABSTRACT

Controls are provided for a web server to generate client side markups that include recognition and/or audible prompting. The controls comprise elements of a dialog such as a prompt, answer, confirmation, command and validation. An application control provides a means to wrap common speech scenarios in one control.

BACKGROUND OF THE INVENTION

[0001] The present invention generally relates to encoding computers to perform a specific application. More particularly, the present invention relates to controls for defining an application to perform recognition and/or audible prompting, such as a server that generates client side markup enabled with recognition and/or audible prompting.

[0002] Small computing devices such as personal digital assistants (PDAs), devices and portable phones are used with ever increasing frequency by people in their day-to-day activities. With the increase in processing power now available for microprocessors used to run these devices, the functionality of these devices is increasing, and in some cases, merging. For instance, many portable phones now can be used to access and browse the Internet as well as to store personal information such as addresses, phone numbers and the like.

[0003] In view that these computing devices are being used for browsing the Internet, or are used in other server/client architectures, it is therefore necessary to enter information into the computing device. Unfortunately, due to the desire to keep these devices as small as possible in order that they are easily carried, conventional keyboards having all the letters of the alphabet as isolated buttons are usually not possible due to the limited surface area available on the housings of the computing devices.

[0004] To address this problem, there has been increased interest and adoption of using voice or speech to access information over a wide area network such as the Internet. For example, voice portals such as through the use of VoiceXML (voice extensible markup language) have been advanced to allow Internet content to be accessed using only a telephone. In this architecture, a document server (for example, a web server) processes requests from a client through a VoiceXML interpreter. The web server can produce VoiceXML documents in reply, which are processed by the VoiceXML interpreter and rendered audibly to the user. Using voice commands through voice recognition, the user can navigate the web.

[0005] Generally, there are two techniques of “speech enabling” information or web content. In the first technique, existing visual markup language pages typically visually rendered by a device having a display are interpreted and rendered aurally. However, this approach often yields poor results because pages meant for visual interaction usually do not have enough information to create a sensible aural dialog automatically. In addition, voice interaction is prone to error, especially over noisy channels such as a telephone. Without visual or other forms of persistent feedback, navigation through the web server application can be extremely difficult for the user. This approach thus requires mechanisms such as help messages, which are also rendered audibly to the user in order to help them navigate through the website. The mechanisms are commonly referred to as “voice dialogs”, which also must address errors when incorrect information or no information is provided by the user, for example, in response to an audible question. Since the mechanisms are not commonly based on the visual content of the web page, they cannot be generated automatically, and therefore typically require extensive development time by the application developer.

[0006] A second approach to speech enabling web content includes writing specific voice pages in a new language. An advantage of this approach is that the speech-enabled page contains all the mechanisms needed for aural dialog such as repairs and navigational help. However, a significant disadvantage is that the application pages must then be adapted to include the application logic as found in the visual content pages. In other words, the application logic of the visual content pages must be rewritten in the form of the speech-enabled language. Even when this process can be automated by the use of tools creating visual and aural pages from the same specification, maintenance of the visual and speech-enabled pages is usually difficult to synchronize. In addition, this approach does not easily allow multimodal applications, for example where both visual and speech interaction is provided on the web page. Since the visual and speech-enabled pages are unrelated, the input and output logic is not easily coordinated to work with each other.

[0007] To date, speech interaction is also cumbersome due to the organization or format currently used as the interface. Generally, the speech interface either tends to be tied too closely to the business logic of the application, which inhibits re-use of the elements of the speech interface in other applications, or the speech interface is too restricted by a simplistic dialog model (e.g. forms and fields).

[0008] As a result of the difficulties in developing speech interaction applications, authoring of the applications is costly and time consuming. There is thus an ongoing need to improve upon the architecture and methods used to provide speech recognition in an application, for example in a server/client architecture such as the Internet. In particular, a method, system or authoring tool that addresses one, several or all of the foregoing disadvantages and thus provides generation of speech-enabled recognition and/or speech-enabled prompting in an application is needed.

SUMMARY OF THE INVENTION

[0009] Controls are provided for a web server to generate client side markups that include recognition and/or audible prompting. The controls comprise elements of a dialog such as a prompt, answer, confirmation, command and validation. An application control provides a means to wrap common speech scenarios in one control.

[0010] The controls, when executed on a computer, generate client side markup for a client in a client/server system. A first set of visual controls have attributes for visual rendering on the client device, while a second set of controls have attributes related to at least one of recognition and audibly prompting. An application control is used to perform a selected task on the client device. The application control has properties for outputting controls of the second set to perform the selected task and associating the outputted controls with the first set of controls.

[0011] In short, an application control, which can take many different forms such as provided in Appendix D, allows the application author to rapidly develop an application by using application controls rather than manually coding all the necessary syntax with the first and second set of controls to perform a selected task. The tasks can include obtaining information, e.g. numbers, characters, dates etc., or navigating a table of information. The application that is developed may include various built-in prompts, grammars and dialog flow or generate these features automatically. Use of the controls saves time and cost in development.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] FIG. 1 is a plan view of a first embodiment of a computing device operating environment.

[0013] FIG. 2 is a block diagram of the computing device of FIG. 1.

[0014] FIG. 3 is a block diagram of a general purpose computer.

[0015] FIG. 4 is a block diagram of an architecture for a client/server system.

[0016] FIG. 5 is a display for obtaining credit card information.

[0017] FIG. 6 is an exemplary page of mark-up language executable on a client having a display and voice recognition capabilities.

[0018] FIG. 7 is a block diagram illustrating a first approach for providing recognition and audible prompting in client side markups.

[0019] FIG. 8 is a block diagram illustrating a second approach for providing recognition and audible prompting in client side markups.

[0020] FIG. 9 is a block diagram illustrating a third approach for providing recognition and audible prompting in client side markups.

[0021] FIG. 10 is a block diagram illustrating companion controls.

[0022] FIG. 11 is a detailed block diagram illustrating companion controls of a first embodiment.

[0023] FIG. 12 is a block diagram illustrating companion controls of a second embodiment.

[0024] FIG. 13 is a block diagram illustrating speech controls inheritance for the second embodiment.

[0025] FIG. 14 is a schematic illustration for a system to generate navigator control code.

[0026] FIG. 15 is a schematic illustration of tasks that may be completed by an author in order to generate navigator control code.

[0027] FIG. 16 is an exemplary table that can be navigated.

[0028] FIG. 17 is a flow diagram of an exemplary method used for navigating a table.

[0029] FIGS. 18 and 19 are examples of table navigation.

DETAILED DESCRIPTION OF THE ILLUSTRATIVE EMBODIMENTS

[0030] Before describing architecture of web based recognition and methods for implementing the same, it may be useful to describe generally computing devices that can function in the architecture. Referring now to FIG. 1, an exemplary form of a data management device (PIM, PDA or the like) is illustrated at 30. However, it is contemplated that the present invention can also be practiced using other computing devices discussed below, and in particular, those computing devices having limited surface areas for input buttons or the like. For example, phones and/or data management devices will also benefit from the present invention. Such devices will have an enhanced utility compared to existing portable personal information management devices and other portable electronic devices, and the functions and compact size of such devices will more likely encourage the user to carry the device at all times. Accordingly, it is not intended that the scope of the architecture herein described be limited by the disclosure of an exemplary data management or PIM device, phone or computer herein illustrated.

[0031] An exemplary form of a data management mobile device 30 is illustrated in FIG. 1. The mobile device 30 includes a housing 32 and has a user interface including a display 34, which uses a contact sensitive display screen in conjunction with a stylus 33. The stylus 33 is used to press or contact the display 34 at designated coordinates to select a field, to selectively move a starting position of a cursor, or to otherwise provide command information such as through gestures or handwriting. Alternatively, or in addition, one or more buttons 35 can be included on the device 30 for navigation. In addition, other input mechanisms such as rotatable wheels, rollers or the like can also be provided. However, it should be noted that the invention is not intended to be limited by these forms of input mechanisms. For instance, another form of input can include a visual input such as through computer vision.

[0032] Referring now to FIG. 2, a block diagram illustrates the functional components comprising the mobile device 30. A central processing unit (CPU) 50 implements the software control functions. CPU 50 is coupled to display 34 so that text and graphic icons generated in accordance with the controlling software appear on the display 34. A speaker 43 can be coupled to CPU 50 typically with a digital-to-analog converter 59 to provide an audible output. Data that is downloaded or entered by the user into the mobile device 30 is stored in a non-volatile read/write random access memory store 54 bi-directionally coupled to the CPU 50. Random access memory (RAM) 54 provides volatile storage for instructions that are executed by CPU 50, and storage for temporary data, such as register values. Default values for configuration options and other variables are stored in a read only memory (ROM) 58. ROM 58 can also be used to store the operating system software for the device that controls the basic functionality of the mobile device 30 and other operating system kernel functions (e.g., the loading of software components into RAM 54).

[0033] RAM 54 also serves as a storage for the code in the manner analogous to the function of a hard drive on a PC that is used to store application programs. It should be noted that although non-volatile memory is used for storing the code, it alternatively can be stored in volatile memory that is not used for execution of the code.

[0034] Wireless signals can be transmitted/received by the mobile device through a wireless transceiver 52, which is coupled to CPU 50. An optional communication interface 60 can also be provided for downloading data directly from a computer (e.g., desktop computer), or from a wired network, if desired. Accordingly, interface 60 can comprise various forms of communication devices, for example, an infrared link, modem, a network card, or the like.

[0035] Mobile device 30 includes a microphone 29, an analog-to-digital (A/D) converter 37, and an optional recognition program (speech, DTMF, handwriting, gesture or computer vision) stored in store 54. By way of example, in response to audible information, instructions or commands from a user of device 30, microphone 29 provides speech signals, which are digitized by A/D converter 37. The speech recognition program can perform normalization and/or feature extraction functions on the digitized speech signals to obtain intermediate speech recognition results. Using wireless transceiver 52 or communication interface 60, speech data is transmitted to a remote recognition server 204 discussed below and illustrated in the architecture of FIG. 4. Recognition results are then returned to mobile device 30 for rendering (e.g. visual and/or audible) thereon, and eventual transmission to a web server 202 (FIG. 4), wherein the web server 202 and mobile device 30 operate in a client/server relationship. Similar processing can be used for other forms of input. For example, handwriting input can be digitized with or without pre-processing on device 30. Like the speech data, this form of input can be transmitted to the recognition server 204 for recognition wherein the recognition results are returned to at least one of the device 30 and/or web server 202. Likewise, DTMF data, gesture data and visual data can be processed similarly. Depending on the form of input, device 30 (and the other forms of clients discussed below) would include necessary hardware such as a camera for visual input.

[0036] In addition to the portable or mobile computing devices described above, it should also be understood that the present invention can be used with numerous other computing devices such as a general desktop computer. For instance, the present invention will allow a user with limited physical abilities to input or enter text into a computer or other computing device when other conventional input devices, such as a full alpha-numeric keyboard, are too difficult to operate.

[0037] The invention is also operational with numerous other general purpose or special purpose computing systems, environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, wireless or cellular telephones, regular telephones (without any screen), personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.

[0038] The following is a brief description of a general purpose computer 120 illustrated in FIG. 3. However, the computer 120 is again only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the invention. Neither should the computer 120 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated therein.

[0039] The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices. Tasks performed by the programs and modules are described below and with the aid of figures. Those skilled in the art can implement the description and figures as processor executable instructions, which can be written on any form of a computer readable medium.

[0040] With reference to FIG. 3, components of computer 120 may include, but are not limited to, a processing unit 140, a system memory 150, and a system bus 141 that couples various system components including the system memory to the processing unit 140. The system bus 141 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Universal Serial Bus (USB), Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as Mezzanine bus. Computer 120 typically includes a variety of computer readable mediums. Computer readable mediums can be any available media that can be accessed by computer 120 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer readable mediums may comprise computer storage media and communication media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 120.

[0041] Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.

[0042] The system memory 150 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 151 and random access memory (RAM) 152. A basic input/output system 153 (BIOS), containing the basic routines that help to transfer information between elements within computer 120, such as during start-up, is typically stored in ROM 151. RAM 152 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 140. By way of example, and not limitation, FIG. 3 illustrates operating system 154, application programs 155, other program modules 156, and program data 157.

[0043] The computer 120 may also include other removable/non-removable volatile/nonvolatile computer storage media. By way of example only, FIG. 3 illustrates a hard disk drive 161 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 171 that reads from or writes to a removable, nonvolatile magnetic disk 172, and an optical disk drive 175 that reads from or writes to a removable, nonvolatile optical disk 176 such as a CD ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 161 is typically connected to the system bus 141 through a non-removable memory interface such as interface 160, and magnetic disk drive 171 and optical disk drive 175 are typically connected to the system bus 141 by a removable memory interface, such as interface 170.

[0044] The drives and their associated computer storage media discussed above and illustrated in FIG. 3 provide storage of computer readable instructions, data structures, program modules and other data for the computer 120. In FIG. 3, for example, hard disk drive 161 is illustrated as storing operating system 164, application programs 165, other program modules 166, and program data 167. Note that these components can either be the same as or different from operating system 154, application programs 155, other program modules 156, and program data 157. Operating system 164, application programs 165, other program modules 166, and program data 167 are given different numbers here to illustrate that, at a minimum, they are different copies.

[0045] A user may enter commands and information into the computer 120 through input devices such as a keyboard 182, a microphone 183, and a pointing device 181, such as a mouse, trackball or touch pad. Other input devices (not shown) may include a joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 140 through a user input interface 180 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port or a universal serial bus (USB). A monitor 184 or other type of display device is also connected to the system bus 141 via an interface, such as a video interface 185. In addition to the monitor, computers may also include other peripheral output devices such as speakers 187 and printer 186, which may be connected through an output peripheral interface 188.

[0046] The computer 120 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 194. The remote computer 194 may be a personal computer, a hand-held device, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 120. The logical connections depicted in FIG. 3 include a local area network (LAN) 191 and a wide area network (WAN) 193, but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.

[0047] When used in a LAN networking environment, the computer 120 is connected to the LAN 191 through a network interface or adapter 190. When used in a WAN networking environment, the computer 120 typically includes a modem 192 or other means for establishing communications over the WAN 193, such as the Internet. The modem 192, which may be internal or external, may be connected to the system bus 141 via the user input interface 180, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 120, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation, FIG. 3 illustrates remote application programs 195 as residing on remote computer 194. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.

Exemplary Architecture

[0048] FIG. 4 illustrates architecture 200 for web based recognition as can be used with the present invention. Generally, information stored in a web server 202 can be accessed through mobile device 30 (which herein also represents other forms of computing devices having a display screen, a microphone, a camera, a touch sensitive panel, etc., as required based on the form of input), or through phone 80 wherein information is requested audibly or through tones generated by phone 80 in response to keys depressed and wherein information from web server 202 is provided only audibly back to the user.

[0049] In this exemplary embodiment, architecture 200 is unified in that whether information is obtained through device 30 or phone 80 using speech recognition, a single recognition server 204 can support either mode of operation. In addition, architecture 200 operates using an extension of well-known markup languages (e.g. HTML, XHTML, cHTML, XML, WML, and the like). Thus, information stored on web server 202 can also be accessed using well-known GUI methods found in these markup languages. By using an extension of well-known markup languages, authoring on the web server 202 is easier, and legacy applications currently existing can be also easily modified to include voice or other forms of recognition.

[0050] Generally, device 30 executes HTML+ scripts, or the like, provided by web server 202. When voice recognition is required, by way of example, speech data, which can be digitized audio signals or speech features wherein the audio signals have been preprocessed by device 30 as discussed above, are provided to recognition server 204 with an indication of a grammar or language model to use during speech recognition. The implementation of the recognition server 204 can take many forms, one of which is illustrated, but generally includes a recognizer 211. The results of recognition are provided back to device 30 for local rendering if desired or appropriate. Upon compilation of information through recognition and any graphical user interface if used, device 30 sends the information to web server 202 for further processing and receipt of further HTML scripts, if necessary.
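
By way of illustration only, the round trip described above can be sketched in Python. The class and method names below (RecognitionRequest, RecognitionServer, Client, fill_field, submit) are hypothetical and do not appear in this disclosure, and the recognizer is replaced by a lookup table so that the sketch is self-contained.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class RecognitionRequest:
    """Speech data plus an indication of the grammar to use (hypothetical type)."""
    speech_data: bytes  # digitized audio or pre-extracted speech features
    grammar: str        # name or location of the grammar or language model


class RecognitionServer:
    """Stand-in for recognition server 204; a lookup table replaces the recognizer."""

    def __init__(self, canned_results: Dict[Tuple[str, bytes], str]):
        self._canned_results = canned_results

    def recognize(self, request: RecognitionRequest) -> Optional[str]:
        # Return recognized text, or None when nothing matches the grammar.
        return self._canned_results.get((request.grammar, request.speech_data))


@dataclass
class Client:
    """Stand-in for device 30: gathers field values, then submits them."""
    recognizer: RecognitionServer
    fields: Dict[str, str] = field(default_factory=dict)

    def fill_field(self, name: str, speech_data: bytes, grammar: str) -> Optional[str]:
        result = self.recognizer.recognize(RecognitionRequest(speech_data, grammar))
        if result is not None:
            self.fields[name] = result  # local rendering would happen here
        return result

    def submit(self) -> Dict[str, str]:
        # The compiled information is what would be sent on to web server 202.
        return dict(self.fields)


server = RecognitionServer({("city.grammar", b"\x01\x02"): "Seattle"})
client = Client(server)
client.fill_field("city", b"\x01\x02", "city.grammar")
print(client.submit())
```

In this sketch the client never needs to know how recognition is performed; it supplies only the speech data and the grammar indication, which mirrors the separation between device 30 and recognition server 204.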

[0051] As illustrated in FIG. 4, device 30, web server 202 and recognition server 204 are commonly connected, and separately addressable, through a network 205, herein a wide area network such as the Internet. It therefore is not necessary that any of these devices be physically located adjacent to each other. In particular, it is not necessary that web server 202 includes recognition server 204. In this manner, authoring at web server 202 can be focused on the application to which it is intended without the authors needing to know the intricacies of recognition server 204. Rather, recognition server 204 can be independently designed and connected to the network 205, and thereby, be updated and improved without further changes required at web server 202. As discussed below, web server 202 can also include an authoring mechanism that can dynamically generate client-side markups and scripts. In a further embodiment, the web server 202, recognition server 204 and client 30 may be combined depending on the capabilities of the implementing machines. For instance, if the client comprises a general purpose computer, e.g. a personal computer, the client may include the recognition server 204. Likewise, if desired, the web server 202 and recognition server 204 can be incorporated into a single machine.

[0052] Access to web server 202 through phone 80 includes connection of phone 80 to a wired or wireless telephone network 208 that, in turn, connects phone 80 to a third party gateway 210. Gateway 210 connects phone 80 to a telephony voice browser 212. Telephony voice browser 212 includes a media server 214 that provides a telephony interface and a voice browser 216. Like device 30, telephony voice browser 212 receives HTML scripts or the like from web server 202. In one embodiment, the HTML scripts are of the form similar to HTML scripts provided to device 30. In this manner, web server 202 need not support device 30 and phone 80 separately, or even support standard GUI clients separately. Rather, a common markup language can be used. In addition, like device 30, voice recognition from audible signals transmitted by phone 80 is provided from voice browser 216 to recognition server 204, either through the network 205, or through a dedicated line 207, for example, using TCP/IP. Web server 202, recognition server 204 and telephony voice browser 212 can be embodied in any suitable computing environment such as the general purpose desktop computer illustrated in FIG. 3.

[0053] However, it should be noted that if DTMF recognition is employed, this form of recognition would generally be performed at the media server 214, rather than at the recognition server 204. In other words, the DTMF grammar would be used by the media server 214.
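
The division of labor for a telephone call can be summarized in a minimal Python sketch. The enumeration and function names are hypothetical; the returned strings simply name the components described above.

```python
from enum import Enum


class InputKind(Enum):
    """Forms of input arriving from phone 80 (hypothetical enumeration)."""
    SPEECH = "speech"
    DTMF = "dtmf"


def recognition_endpoint(kind: InputKind) -> str:
    """Name the component that applies the grammar for the given input."""
    if kind is InputKind.DTMF:
        # Key-press tones are generally matched against the DTMF grammar
        # at the media server rather than at the recognition server.
        return "media server 214"
    return "recognition server 204"


print(recognition_endpoint(InputKind.DTMF))
print(recognition_endpoint(InputKind.SPEECH))
```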

[0054] Referring back to FIG. 4, web server 202 can include a server side plug-in authoring tool or module 209 (e.g. ASP, ASP+, ASP.Net by Microsoft Corporation, JSP, Javabeans, or the like). Server side plug-in module 209 can dynamically generate client-side markups and even a specific form of markup for the type of client accessing the web server 202. The client information can be provided to the web server 202 upon initial establishment of the client/server relationship, or the web server 202 can include modules or routines to detect the capabilities of the client device. In this manner, server side plug-in module 209 can generate a client side markup for each of the voice recognition scenarios, i.e. voice only through phone 80 or multimodal for device 30. By using a consistent client side model, application authoring for many different clients is significantly easier.
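
As a minimal sketch of this idea, assuming a server-side routine written in Python, the same field definition can be rendered differently for a voice-only client and a multimodal client. The function render_field, the ClientInfo type, and the element and attribute names in the generated markup (prompt, reco, grammar, bind, targetElement) are placeholders chosen for illustration and are not asserted to be the syntax of any particular markup extension.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class ClientInfo:
    """Capabilities reported by, or detected for, the client (hypothetical type)."""
    has_display: bool
    has_speech: bool


def render_field(field_id: str, label: str, prompt: str,
                 grammar_uri: str, client: ClientInfo) -> str:
    """Generate client-side markup for one input field, tailored to the client."""
    fid = escape(field_id)
    parts = []
    if client.has_display:
        # Visual rendering for GUI and multimodal clients.
        parts.append(f'<label for="{fid}">{escape(label)}</label>')
        parts.append(f'<input type="text" id="{fid}"/>')
    else:
        # Voice-only clients still need a place to hold the recognized value.
        parts.append(f'<input type="hidden" id="{fid}"/>')
    if client.has_speech:
        # Placeholder speech elements: a prompt, and a recognition element
        # that names a grammar and binds the result to the field.
        parts.append(f'<prompt>{escape(prompt)}</prompt>')
        parts.append(
            f'<reco><grammar src="{escape(grammar_uri)}"/>'
            f'<bind targetElement="{fid}"/></reco>'
        )
    return "\n".join(parts)


voice_only = ClientInfo(has_display=False, has_speech=True)   # e.g. phone 80
multimodal = ClientInfo(has_display=True, has_speech=True)    # e.g. device 30
print(render_field("cardNumber", "Card number", "What is the card number?",
                   "cardnumber.grammar", multimodal))
```

The author describes the field once; the choice between voice-only and multimodal output is made from the client information at generation time.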

[0055] In addition to dynamically generating client side markups, high-level dialog modules, discussed below, can be implemented as a server-side control stored in store 211 for use by developers in application authoring. In general, the high-level dialog modules 211 would dynamically generate client-side markup and script in both voice-only and multimodal scenarios based on parameters specified by developers. The high-level dialog modules 211 can include parameters to generate client-side markups to fit the developers' needs.

Exemplary Client Side Extensions

[0056] Before describing dynamic generation of client-side markups to which the present invention is directed, it may be helpful to first discuss an exemplary form of extensions to the markup language for use in web based recognition.

[0057] As indicated above, the markup languages such as HTML, XHTML, cHTML, XML, WML or any other SGML-derived markup, which are used for interaction between the web server 202 and the client device 30, are extended to include controls and/or objects that provide recognition in a client/server architecture. Generally, controls and/or objects can include one or more of the following functions: recognizer controls and/or objects for recognizer configuration, recognizer execution and/or post-processing; synthesizer controls and/or objects for synthesizer configuration and prompt playing; grammar controls and/or objects for specifying input grammar resources; and/or binding controls and/or objects for processing recognition results. The extensions are designed to be a lightweight markup layer, which adds the power of an audible, visual, handwriting, etc. interface to existing markup languages. As such, the extensions can remain independent of: the high-level page in which they are contained, e.g. HTML; the low-level formats which the extensions use to refer to linguistic resources, e.g. the text-to-speech and grammar formats; and the individual properties of the recognition and speech synthesis platforms used in the recognition server 204. Although speech recognition will be discussed below, it should be understood that the techniques, tags and server side controls described hereinafter can be similarly applied in handwriting recognition, gesture recognition and image recognition.

[0058] In the exemplary embodiment, the extensions (also commonly known as “tags”) are a small set of XML elements, with associated attributes and DOM object properties, events and methods, which may be used in conjunction with a source markup document to apply a recognition and/or audible prompting interface, DTMF or call control to a source page. The extensions' formalities and semantics are independent of the nature of the source document, so the extensions can be used equally effectively within HTML, XHTML, cHTML, XML, WML, or with any other SGML-derived markup. The extensions follow the document object model wherein new functional objects or elements, which can be hierarchical, are provided. Each of the elements is discussed in detail in the Appendix, but generally the elements can include attributes, properties, methods, events and/or other “child” elements.

[0059] At this point, it should also be noted that the extensions may be interpreted in two different “modes” according to the capabilities of the device upon which the browser is being executed. In a first mode, “object mode”, the full capabilities are available. The programmatic manipulation of the extensions by an application is performed by whatever mechanisms are enabled by the browser on the device, e.g. a JScript interpreter in an XHTML browser, or a WMLScript interpreter in a WML browser. For this reason, only a small set of core properties and methods of the extensions need to be defined, and these are manipulated by whatever programmatic mechanisms exist on the device or client side. The object mode provides eventing and scripting and can offer greater functionality to give the dialog author a much finer client-side control over speech interactions. As used herein, a browser that supports full eventing and scripting is called an “uplevel browser”. This form of a browser will support all the attributes, properties, methods and events of the extensions. Uplevel browsers are commonly found on devices with greater processing capabilities.

[0060] The extensions can also be supported in a “declarative mode”. As used herein, a browser operating in a declarative mode is called a “downlevel browser” and does not support full eventing and scripting capabilities. Rather, this form of browser will support the declarative aspects of a given extension (i.e. the core element and attributes), but not all the DOM (document object model) object properties, methods and events. This mode employs exclusively declarative syntax, and may further be used in conjunction with declarative multimedia synchronization and coordination mechanisms (synchronized markup language) such as SMIL (Synchronized Multimedia Integration Language) 2.0. Downlevel browsers will typically be found on devices with limited processing capabilities.

[0061] At this point though, a particular mode of entry should be discussed. In particular, use of speech recognition in conjunction with at least a display and, in a further embodiment, a pointing device as well, which enables the coordination of multiple modes of input, e.g. to indicate the fields for data entry, is particularly useful. Specifically, in this mode of data entry, the user is generally able to coordinate the actions of the pointing device with the speech input, so for example the user is in control of when to select a field and provide corresponding information relevant to the field. For instance, in the credit card submission graphical user interface (GUI) illustrated in FIG. 5, a user could first decide to enter the credit card number in field 252 and then enter the type of credit card in field 250 followed by the expiration date in field 254. Likewise, the user could return to field 252 and correct an errant entry, if desired. When combined with speech recognition, an easy and natural form of navigation is provided. As used herein, this form of entry using both a screen display allowing free form actions of the pointing device on the screen, e.g. the selection of fields, and recognition is called “multimodal”.

[0062] Referring to FIG. 6, an HTML markup language code is illustrated. The HTML code includes a body portion 270 and a script portion 272. Entry of information in each of the fields 250, 252 and 254 is controlled or executed by code portions 280, 282 and 284, respectively. Referring first to code portion 280, on selection of field 250, for example, by use of stylus 33 of device 30, the event “onClick” is initiated which calls or executes function “talk” in script portion 272. This action activates a grammar used for speech recognition that is associated with the type of data generally expected in field 250. This type of interaction, which involves more than one technique of input (e.g. voice and pen-click/roller), is referred to as “multimodal”.

[0063] Referring now back to the grammar, the grammar is a syntactic grammar such as but not limited to a context-free grammar, an N-grammar or a hybrid grammar. (Of course, DTMF grammars, handwriting grammars, gesture grammars and image grammars would be used when corresponding forms of recognition are employed. As used herein, a “grammar” includes information for performing recognition, and in a further embodiment, information corresponding to expected input to be entered, for example, in a specific field.) A control 290 (herein identified as “reco”) includes various elements, two of which are illustrated, namely a grammar element “grammar” and a “bind” element. Generally, like the code downloaded to a client from web server 202, the grammars can originate at web server 202 and be downloaded to the client and/or forwarded to a remote server for speech processing. The grammars can then be stored locally thereon in a cache. Eventually, the grammars are provided to the recognition server 204 for use in recognition. The grammar element is used to specify grammars, either inline or referenced using an attribute.

[0064] Upon receipt of recognition results from recognition server 204 corresponding to the recognized speech, handwriting, gesture, image, etc., syntax of reco control 290 is provided to receive the corresponding results and associate them with the corresponding field, which can include rendering of the text therein on display 34. In the illustrated embodiment, upon completion of speech recognition with the result sent back to the client, the client deactivates the reco object and associates the recognized text with the corresponding field. Portions 282 and 284 operate similarly wherein unique reco objects and grammars are called for each of the fields 252 and 254 and, upon receipt, the recognized text is associated with each of the fields 252 and 254. With respect to receipt of the card number field 252, the function “handle” checks the length of the card number with respect to the card type.
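The kind of check performed by the function “handle” can be sketched as follows. This is only an illustration and not the code of FIG. 6: the function name cardNumberLengthOk, the EXPECTED_LENGTHS table and the particular lengths listed in it are assumptions chosen for the example.

```javascript
// Hypothetical table of plausible card-number lengths per card type.
// The entries are illustrative assumptions, not values taken from FIG. 6.
const EXPECTED_LENGTHS = {
  visa: [13, 16],
  mastercard: [16],
  amex: [15],
};

// Returns true when the recognized card number has a length that is
// plausible for the card type already entered in the card-type field.
function cardNumberLengthOk(cardType, recognizedNumber) {
  // Recognized text may contain spaces or dashes; keep digits only.
  const digits = String(recognizedNumber).replace(/\D/g, "");
  const key = String(cardType).toLowerCase();
  // Only consult the table's own entries, never inherited properties.
  if (!Object.prototype.hasOwnProperty.call(EXPECTED_LENGTHS, key)) {
    return false; // unknown card type: treat the number as not validated
  }
  return EXPECTED_LENGTHS[key].includes(digits.length);
}
```

A handler bound to the card number field could call such a function once the recognized text arrives and, on failure, re-prompt the user rather than accept the value.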

Generation of Client Side Markups

[0065] As indicated above, server side plug-in module 209 outputs client side markups when a request has been made from the client device 30. In short, the server side plug-in module 209 allows the website, and thus, the application and services provided by the application to be defined or constructed. The instructions in the server side plug-in module 209 are made of compiled code. The code is run when a web request reaches the web server 202. The server side plug-in module 209 then outputs a new client side markup page that is sent to the client device 30. As is well known, this process is commonly referred to as rendering. The server side plug-in module 209 operates on “controls” that abstract and encapsulate the markup language, and thus, the code of the client side markup page. Such controls that abstract and encapsulate the markup language and operate on the web server 202 include or are equivalent to “Servlets” or “Server-side plug ins” to name a few.

[0066] As is known, server side plug-in modules of the prior art can generate client side markup for visual rendering and interaction with the client device 30. Three different approaches are provided herein for extending the server side plug-in module 209 to include recognition and audible prompting extensions such as the exemplary client side extensions discussed above. In a first approach illustrated schematically in FIG. 7, the current, visual, server side controls (which include parameters for visual display such as location for rendering, font, foreground color, background color, etc.) are extended to include parameters or attributes for recognition and audible prompting for related recognition. Using speech recognition and associated audible prompting by way of example, the attributes generally pertain to audible prompting parameters such as whether the prompt comprises inline text for text-to-speech conversion, playing of a prerecorded audio file (e.g. a wave file), the location of the data (text for text-to-speech conversion or a prerecorded audio file) for audible rendering, etc. For recognition, the parameters or attributes can include the location of the grammar to be used during recognition, confidence level thresholds, etc. Since the server side plug-in module 209 generates client side markup, the parameters and attributes for the controls for the server side plug-in module 209 relate to the extensions provided in the client side markup for recognition and/or audible prompting.

[0067] The controls indicated at 300A in FIG. 7 are controls, which are well-known in website application development or authoring tools such as ASP, ASP+, ASP.Net, JSP, Javabeans, or the like. Such controls are commonly formed in a library and used by controls 302 to perform a particular visual task. Library 300A includes methods for generating the desired client markup, event handlers, etc. Examples of visual controls 302 include a “Label” control that provides a selected text label on a visual display such as the label “Credit Card Submission” 304 in FIG. 5. Another example of a higher level visual control 302 is a “Textbox”, which allows data to be entered in a data field such as is indicated at 250 in FIG. 5. The existing visual controls 302 are also well-known. In the first approach for extending server side plug-in module controls to include recognition and/or audible prompting, each of the visual controls 302 would include further parameters or attributes related to recognition or audible prompting. In the case of the “label” control, which otherwise provides selected text on a visual display, further attributes may include whether an audio data file will be rendered or text-to-speech conversion will be employed as well as the location of this data file. A library 300B, similar to library 300A, includes further markup information for performing recognition and/or audible prompting. Each of the visual controls 302 is coded so as to provide this information to the controls 300B as appropriate to perform the particular task related to recognition or audible prompting.

[0068] As another example, the “Textbox” control, which generates an input field on a visual display and allows the user of the client device 30 to enter information, would also include appropriate recognition or audible prompting parameters or attributes such as the grammar to be used for recognition. It should be noted that the recognition or audible prompting parameters are optional and need not be used if recognition or audible prompting is not otherwise desired.

[0069] In general, if a control at level 302 includes parameters that pertain to visual aspects, the control will access and use the library 300A. Likewise, if the control includes parameters pertaining to recognition and/or audible prompting the control will access or use the library 300B. It should be noted that libraries 300A and 300B have been illustrated separately in order to emphasize the additional information present in library 300B and that a single library having the information of libraries 300A and 300B can be implemented.

[0070] In this approach, each of the current or prior art visual controls 302 is extended to include appropriate recognition/audible prompting attributes. The controls 302 can be formed in a library. The server side plug-in module 209 accesses the library for markup information. Execution of the controls generates a client side markup page, or a portion thereof, with the provided parameters.

[0071] In a second approach illustrated in FIG. 8, new visual, recognition/audible prompting controls 304 are provided such that the controls 304 are a subclass relative to visual controls 302, wherein recognition/audible prompting functionality or markup information is provided at controls 304. In other words, a new set of controls 304 is provided for recognition/audible prompting and includes appropriate parameters or attributes to perform the desired recognition or an audible prompting related to a recognition task on the client device 30. The controls 304 use the existing visual controls 302 to the extent that visual information is rendered or obtained through a display. For instance, a control “SpeechLabel” at level 304 uses the “Label” control at level 302 to provide an audible rendering and/or visual text rendering. Likewise, a “SpeechTextbox” control would associate a grammar and related recognition resources and processing with an input field. Like the first approach, the attributes for controls 304 include where the grammar is located for recognition, the inline text for text-to-speech conversion, or the location of a prerecorded audio data file that will be rendered directly or a text file through text-to-speech conversion. The second approach is advantageous in that interactions of the recognition controls 304 with the visual controls 302 are through parameters or attributes, and thus, changes in the visual controls 302 may not require any changes in the recognition controls 304 provided the parameters or attributes interfacing between the controls 304 and 302 are still appropriate. However, with the creation of further visual controls 302, a corresponding recognition/audible prompting control at level 304 may also have to be written.
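The subclass relationship of the second approach can be pictured with a minimal sketch. The class names follow the “Textbox” and “SpeechTextbox” naming used above, but the constructor arguments, the render method and the exact markup emitted are assumptions made for illustration only; they do not represent the output of any actual control library.

```javascript
// Stand-in for an existing visual control at level 302 (hypothetical API).
class Textbox {
  constructor(id) {
    this.id = id;
  }
  // Emits the visual markup for the input field.
  render() {
    return `<input type="text" id="${this.id}" />`;
  }
}

// Stand-in for a control at level 304: it reuses the visual rendering of
// its parent and adds recognition markup that refers to a grammar.
class SpeechTextbox extends Textbox {
  constructor(id, grammarSrc) {
    super(id);
    this.grammarSrc = grammarSrc; // location of the grammar for recognition
  }
  render() {
    const visual = super.render(); // visual part comes from the parent control
    const reco =
      `<reco id="reco_${this.id}">` +
      `<grammar src="${this.grammarSrc}" />` +
      `<bind targetElement="${this.id}" />` +
      `</reco>`;
    return visual + reco;
  }
}
```

Because the speech control only calls the parent's render method, a change to how the visual control draws itself would not require a change in the subclass, which is the advantage noted for this approach.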

[0072] A third approach is illustrated in FIG. 9. Generally, controls 306 of the third approach are separate from the visual controls 302, but are associated selectively therewith as discussed below. In this manner, the controls 306 do not directly build upon the visual controls 302, but rather provide recognition/audible prompting enablement without having to rewrite the visual controls 302. The controls 306, like the controls 302, use a library 300. In this embodiment, library 300 includes both visual and recognition/audible prompting markup information and as such is a combination of libraries 300A and 300B of FIG. 7.

[0073] There are significant advantages to this third approach. Firstly, the visual controls 302 do not need to be changed in content. Secondly, the controls 306 can form a single module which is consistent and does not need to change according to the nature of the speech-enabled control 302. Thirdly, the process of speech enablement, that is, the explicit association of the controls 306 with the visual controls 302, is fully under the developer's control at design time, since it is an explicit and selective process. This also makes it possible for the markup language of the visual controls to receive input values from multiple sources such as through recognition provided by the markup language generated by controls 306, or through a conventional input device such as a keyboard. In short, the controls 306 can be added to an existing application authoring page of a visual authoring page of the server side plug-in module 209. The controls 306 provide a new modality of interaction (i.e. recognition and/or audible prompting) for the user of the client device 30, while reusing the visual controls' application logic and visual input/output capabilities. In view that the controls 306 can be associated with the visual controls 302 whereat the application logic can be coded, controls 306 may be hereinafter referred to as “companion controls 306” and the visual controls 302 may be referred to as “primary controls 302”. It should be noted that these references are provided for purposes of distinguishing controls 302 and 306 and are not intended to be limiting. For instance, the companion controls 306 could be used to develop or author a website that does not include visual renderings such as a voice-only website. In such a case, certain application logic could be embodied in the companion control logic.

[0074] A first exemplary set of companion controls 306 is further illustrated in FIG. 10. The set of companion controls 306 can be grouped as output controls 308 and input controls 310. Output controls 308 provide “prompting” client side markups, which typically involve the playing of a prerecorded audio file, or text for text-to-speech conversion, the data included in the markup directly or referenced via a URL. Although a single output control can be defined with parameters to handle all audible prompting, and thus should be considered as a further aspect of the present invention, in the exemplary embodiment, the forms or types of audible prompting in a human dialog are formed as separate controls. In particular, the output controls 308 can include a “Question” control 308A, a “Confirmation” control 308B and a “Statement” control 308C, which will be discussed in detail below. Likewise, the input controls 310 can also form or follow human dialog and include an “Answer” control 310A and a “Command” control 310B. The input controls 310 are discussed below, but generally the input controls 310 associate a grammar with expected or possible input from the user of the client device 30.

[0075] Although the question control 308A, confirmation control 308B, statement control 308C, answer control 310A, command control 310B, other controls as well as the general structure of these controls, the parameters and event handlers, are specifically discussed with respect to use as companion controls 306, it should be understood that these controls, the general structure, parameters and event handlers can be adapted to provide recognition and/or audible prompting in the other two approaches discussed above with respect to FIGS. 7 and 8. For instance, the parameter “ControlsToSpeechEnable”, which comprises one exemplary mechanism to form the association between a companion control and a visual control, would not be needed when embodied in the approaches of FIGS. 7 and 8.

[0076] In a multimodal application, at least one of the output controls 308 or one of the input controls 310 is associated with a primary or visual control 302. In the embodiment illustrated, the output controls 308 and input controls 310 are arranged or organized under a “Question/Answer” (hereinafter also “QA”) control 320. QA control 320 is executed on the web server 202, which means it is defined on the application development web page held on the web server using the server-side markup formalism (ASP, JSP or the like), but is output as a different form of markup to the client device 30. Although illustrated in FIG. 10 where the QA control appears to be formed of all of the output controls 308 and the input controls 310, it should be understood that these are merely options wherein one or more may be included for a QA control.

[0077] At this point it may be helpful to explain use of the controls 308 and 310 in terms of application scenarios. Referring to FIG. 11, in a voice-only application, QA control 320 could comprise a single question control 308A and an answer control 310A. The question control 308A contains one or more prompt objects or controls 322, while the answer control 310A can define a grammar through grammar object or control 324 for recognition of the input data and related processing on that input. Line 326 represents the association of the QA control 320 with the corresponding primary control 302, if used. In a multimodal scenario, where the user of the client device 30 may touch on the visual textbox, for example with a “TapEvent”, an audible prompt may not be necessary. For example, for a primary control comprising a textbox having visual text forming an indication of what the user of the client device should enter in the corresponding field, a corresponding QA control 320 may or may not have a corresponding prompt such as an audio playback or a text-to-speech conversion, but would have a grammar corresponding to the expected value for recognition, and event handlers 328 to process the input, or process other recognizer events such as no speech detected, speech not recognized, or events fired on timeouts (as illustrated in “Eventing” below).

[0078] In general, the QA control through the output controls 308 and input controls 310 and additional logic can perform one or more of the following: provide output audible prompting, collect input data, perform confidence validation of the input result, allow additional types of input such as “help” commands, or commands that allow the user of the client device to navigate to other selected areas of the website, allow confirmation of input data and control of dialog flow at the website, to name a few. In short, the QA control 320 contains all the controls related to a specific topic. In this manner, a dialog is created through use of the controls with respect to the topic in order to inform, to obtain information, to confirm validity, or to repair a dialog or change the topic of conversation.

[0079] In one method of development, the application developer can define the visual layout of the application using the visual controls 302. The application developer can then define the spoken interface of the application using companion controls 306 (embodied as QA control 320, or output controls 308 and input controls 310). As illustrated in FIGS. 10 and 11, each of the companion controls 306 is then linked or otherwise associated with the corresponding primary or visual control 302 to provide recognition and audible prompting. Of course if desired, the application developer can define or encode the application by switching between visual controls 302 and companion controls 306, forming the links therebetween, until the application is completely defined or encoded.

[0080] At this point, it may be helpful to provide a short description of each of the output controls 308 and input controls 310. Detailed descriptions are provided below for this embodiment in Appendix B.

[0081] Questions, Answers and Commands

[0082] Generally, as indicated above, the question controls 308A and answer controls 310A in a QA control 320 hold the prompt and grammar resources relevant to the primary control 302, and related binding (associating recognition results with input fields of the client-side markup page) and processing logic. The presence, or not, of question controls 308A and answer controls 310A determines whether speech output or recognition input is enabled on activation. Command controls 310B and user initiative answers are activated by specification of the Scope property on the answer controls 310A and command controls 310B.

[0083] In simple voice-only applications, a QA control 320 will typically hold one question control or object 308A and one answer control or object 310A. Although not shown in the example below, command controls 310B may also be specified, e.g. Help, Repeat, Cancel, etc., to enable user input which does not directly relate to the answering of a particular question.

[0084] A typical ‘regular’ QA control for voice-only dialog is as follows:

    <Speech:QA id="QA_WhichOne" ControlsToSpeechEnable="textBox1" runat="server" >
      <Question >
        <prompt> Which one do you want? </prompt>
      </Question>
      <Answer >
        <grammar src="whichOne.gram" />
      </Answer>
    </Speech:QA>

[0085] (The examples provided herein are written in the ASP.Net framework by way of example only and should not be considered as limiting the present invention.)

[0086] In this example, the QA control can be identified by its “id”, while the association of the QA control with the desired primary or visual control is obtained through the parameter “ControlsToSpeechEnable”, which identifies one or more primary controls by their respective identifiers. If desired, other well-known techniques can be used to form the association. For instance, direct, implicit associations are available through the first and second approaches described above, or separate tables can be created and used to maintain the associations. The parameter “runat” instructs the web server that this code should be executed at the web server 202 to generate the correct markup.
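The identifier-based association can be sketched as a simple lookup performed when the page is processed. The function name resolveSpeechEnabledControls, the object shapes and the use of a comma to separate several identifiers are all assumptions made for this illustration; the text above only states that the parameter identifies one or more primary controls by their identifiers.

```javascript
// Resolves the primary controls named by a QA control's
// ControlsToSpeechEnable parameter (hypothetical object shapes).
function resolveSpeechEnabledControls(qaControl, primaryControls) {
  // Index the primary controls by their identifiers.
  const byId = new Map(primaryControls.map((control) => [control.id, control]));

  return qaControl.controlsToSpeechEnable
    .split(",") // assumed separator for multiple identifiers
    .map((id) => id.trim())
    .filter((id) => id.length > 0)
    .map((id) => {
      const control = byId.get(id);
      if (!control) {
        // Surface a broken association at design time rather than at run time.
        throw new Error(
          `QA control "${qaControl.id}" references unknown primary control "${id}"`
        );
      }
      return control;
    });
}
```

A separate association table, mentioned above as an alternative, would replace the string parameter with an explicit mapping from QA control identifiers to primary control identifiers while keeping the same lookup step.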

[0087] A QA control might also hold only a statement control 308C, in which case it is a prompt-only control without active grammars (e.g. for a welcome prompt). Similarly a QA control might hold only an answer control 310A, in which case it may be a multimodal control, whose answer control 310A activates its grammars directly as the result of an event from the GUI, or a scoped mechanism (discussed below) for user initiative.

[0088] It should also be noted that a QA control 320 may also hold multiple output controls 308 and input controls 310 such as multiple question controls 308A and multiple answer controls 310A. This allows an author to describe interactional flow about the same entity within the same QA control. This is particularly useful for more complex voice-only dialogs. So a mini-dialog which may involve different kinds of question and answer (e.g. asking, confirming, giving help, etc.), can be specified within the wrapper of the QA control associated with the visual control which represents the dialog entity. A complex QA control is illustrated in FIG. 11.

[0089] The foregoing represents the main features of the QA control. Each feature is described from a functional perspective below.

[0090] Answer Control

[0091] The answer control 310A abstracts the notion of grammars, binding and other recognition processing into a single object or control. Answer controls 310A can be used to specify a set of possible grammars relevant to a question, along with binding declarations and relevant scripts. Answer controls for multimodal applications such as “Tap-and-Talk” are activated and deactivated by GUI browser events. The following example illustrates an answer control 310A used in a multimodal application to select a departure city on the “mouseDown” event of the textbox “txtDepCity”, and write its value into the primary textbox control:

    <Speech:QA controlsToSpeechEnable="txtDepCity" runat="server">
      <Answer id="AnsDepCity" StartEvent="onMouseDown" StopEvent="onMouseUp" >
        <grammar src="/grammars/depCities.gram"/>
        <bind value="//sml/DepCity" targetElement="txtCity" />
      </Answer>
    </Speech:QA>

[0092] Typical answer controls 310A in voice-only applications are activated directly by question controls 308A as described below.

[0093] The answer control further includes a mechanism to associate a received result with the primary controls. Herein, binding places the values in the primary controls; however, in another embodiment the association mechanism may allow the primary control to look at or otherwise access the recognized results.

[0094] Question Control

[0095] Question controls 308A abstract the notion of the prompt tags (Appendix A) into an object which contains a selection of possible prompts and the answer controls 310A which are considered responses to the question. Each question control 308A is able to specify which answer control 310A it activates on its execution. This permits appropriate response grammars to be bundled into answer controls 310A, which reflect relevant question controls 308A.

[0096] The following question control 308A might be used in a voice-only application to ask for a Departure City:

    <Speech:QA id="QADepCity" controlsToSpeechEnable="txtDepCity" runat="server" >
      <Question id="Q1" Answers="AnsDepCity" >
        <prompt> Please give me the departure city. </prompt>
      </Question>
      <Answer id="AnsDepCity" ... />
    </Speech:QA>

[0097] In the example below, different prompts can be called depending on an internal condition of the question control 308A. The ability to specify conditional tests on the prompts inside a question control 308A means that changes in wording can be accommodated within the same functional unit of the question control 308A.

    <Speech:QA id="QADepCity" controlsToSpeechEnable="txtDepCity" runat="server" >
      <Question id="Q1" Answers="AnsDepCity" >
        <prompt count="1"> Now I need to get the departure city. Where would you like to fly from? </prompt>
        <prompt count="2"> Which departure city? </prompt>
      </Question>
      <Answer id="AnsDepCity" ... />
    </Speech:QA>
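One plausible reading of the count attribute is that it refers to the number of times the question has been asked, so that a shorter wording is used on a repeat. That reading, the function name selectPrompt and the tie-breaking rule below are assumptions for illustration only; the governing semantics are those of the prompt definitions in the Appendix.

```javascript
// Chooses a prompt for the current attempt (hypothetical selection rule):
// the prompt with the highest count that does not exceed timesAsked.
function selectPrompt(prompts, timesAsked) {
  let chosen = null;
  for (const prompt of prompts) {
    const eligible = prompt.count <= timesAsked;
    if (eligible && (chosen === null || prompt.count > chosen.count)) {
      chosen = prompt;
    }
  }
  // No eligible prompt means nothing is played for this attempt.
  return chosen === null ? null : chosen.text;
}

// Prompts mirroring the departure-city example.
const depCityPrompts = [
  { count: 1, text: "Now I need to get the departure city. Where would you like to fly from?" },
  { count: 2, text: "Which departure city?" },
];
```

With these prompts, the first attempt would use the longer wording and the second and later attempts would fall back to "Which departure city?".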

[0098] Conditional QA Control

[0099] The following example illustrates how to determine whether or not to activate a QA control based upon information known to the application. The example is a portion of a survey application. The survey is gathering information from employees regarding the mode of transportation they use to get to work.

[0100] The portion of the survey first asks whether or not the user rides the bus to work. If the answer is:

[0101] Yes, the next question asks how many days last week the user rode the bus.

[0102] No, the “number of days rode the bus” question is bypassed.

    <asp:Label id="lblDisplay1" text="Do you ride the bus to work?" runat="server"/>
    <asp:DropDownList id="lstRodeBusYN" runat="server">
      <asp:ListItem selected="true">No</asp:ListItem>
      <asp:ListItem>Yes</asp:ListItem>
    </asp:DropDownList>
    <Speech:QA id="QA_RideBus" ControlsToSpeechEnable="lstRodeBusYN" runat="server" >
      <SDN:Question id="Q_RideBus" >
        <prompt bargeIn="False"> Do you ride the bus to work? </prompt>
      </SDN:Question>
      <SDN:Answer id="A_RideBus" autobind="False" StartEvent="onMouseDown" StopEvent="onMouseUp" runat="server" onClientReco="ProcessRideBusAnswer" >
        <grammar src="..." /> <!-- "yes/no" grammar -->
      </SDN:Answer>
    </Speech:QA>
    <asp:Label id="lblDisplay2" enabled="False" text="How many days last week did you ride the bus to work?" runat="server"/>
    <asp:DropDownList id="lstDaysRodeBus" enabled="False" runat="server">
      <asp:ListItem selected="true" >0</asp:ListItem>
      <asp:ListItem>1</asp:ListItem>
      <asp:ListItem>2</asp:ListItem>
      <asp:ListItem>3</asp:ListItem>
      <asp:ListItem>4</asp:ListItem>
      <asp:ListItem>5</asp:ListItem>
      <asp:ListItem>6</asp:ListItem>
      <asp:ListItem>7</asp:ListItem>
    </asp:DropDownList>
    <Speech:QA id="QA_DaysRodeBus" ControlsToSpeechEnable="lstDaysRodeBus" ClientTest="RideBusCheck" runat="server" >
      <SDN:Question id="Q_DaysRodeBus" >
        <prompt bargeIn="False"> How many days last week did you ride the bus to work? </prompt>
      </SDN:Question>
      <SDN:Answer id="A_DaysRodeBus" autobind="False" StartEvent="onMouseDown" StopEvent="onMouseUp" runat="server" onClientReco="ProcessDaysRodeBusAnswer" >
        <grammar src="..." /> <!-- "numbers" grammar -->
      </SDN:Answer>
    </Speech:QA>
    <script language="jscript">
      function ProcessRideBusAnswer( ) {
        // Using the SML attribute of the Event object, determine the yes or no answer,
        // then select the appropriate item in the dropdown listbox
        // and enable the next label and dropdown listbox if the answer is "yes".
        if ( /* answer is "yes" */ ) {
          lstRodeBusYN.selectedIndex = 1;   // zero-based index: 0 is "No", 1 is "Yes"
          lblDisplay2.enabled = true;
          lstDaysRodeBus.enabled = true;
        }
      }
      function RideBusCheck( ) {
        if (lstRodeBusYN.selectedIndex == 0) {   // this is "No"
          return false;
        }
        return true;
      }
      function ProcessDaysRodeBusAnswer( ) {
        // case statement to select proper dropdown item
      }
    </script>

[0103] In the example provided above, the QA control “QA_DaysRodeBus” is executed based on a boolean parameter “ClientTest”, which in this example, is set based on the function RideBusCheck( ). If the function returns a false condition, the QA control is not activated, whereas if a true condition is returned the QA control is activated. The use of an activation mechanism allows increased flexibility and improved dialog flow in the client side markup page produced. As indicated in Appendix B many of the controls and objects include an activation mechanism.
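The activation rule described above can be sketched as a filter over the QA controls on a page. The function name activeQAControls, the object shapes and the choice to treat a QA control without a ClientTest as always eligible are assumptions for illustration, not the behavior of any particular runtime.

```javascript
// Returns the QA controls that are eligible for activation.
// clientTests maps a ClientTest name to a function returning a boolean.
function activeQAControls(qaControls, clientTests) {
  return qaControls.filter((qa) => {
    if (!qa.clientTest) {
      return true; // assumption: no ClientTest means the control is always eligible
    }
    const test = clientTests[qa.clientTest];
    // Activate only when the named test exists and returns a true condition.
    return typeof test === "function" && test() === true;
  });
}

// Hypothetical stand-in for the state of the "lstRodeBusYN" dropdown.
const surveyState = { ridesBus: false };

const surveyControls = [
  { id: "QA_RideBus" },
  { id: "QA_DaysRodeBus", clientTest: "RideBusCheck" },
];

const surveyTests = {
  RideBusCheck: () => surveyState.ridesBus,
};
```

While surveyState.ridesBus is false, only "QA_RideBus" is returned; once it is set to true, "QA_DaysRodeBus" becomes eligible as well, which corresponds to the survey flow in which the second question is bypassed for users who do not ride the bus.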

[0104] Command Control

[0105] Command controls 310B are user utterances common in voice-only dialogs which typically have little semantic import in terms of the question asked, but rather seek assistance or effect navigation, e.g. help, cancel, repeat, etc. The Command control 310B within a QA control 306 can be used to specify not only the grammar and associated processing on recognition (rather like an answer control 310A without binding of the result to an input field), but also a ‘scope’ of context and a type. This allows for the authoring of both global and context-sensitive behavior on the client side markup.

[0106] As appreciated by those skilled in the art from the foregoing description, controls 306 can be organized in a tree structure similar to that used in visual controls 302. Since each of the controls 306 is also associated with selected visual controls 302, the organization of the controls 306 can be related to the structure of the controls 302.

[0107] The QA controls 302 may be used to speech-enable both atomic controls (textbox, label, etc.) and container controls (form, panel, etc.). This provides a way of scoping behaviour and of obtaining modularity of subdialog controls. For example, the scope will allow the user of the client device to navigate to other portions of the client side markup page without completing a dialog.

[0108] In one embodiment, “Scope” is determined as a node of the primary controls tree. The following is an example “help” command, scoped at the level of the “Pnl1” container control, which contains two textboxes.

<asp:panel id="Pnl1" ...>
  <asp:textbox id="tb1" ... />
  <asp:textbox id="tb2" ... />
</asp:panel>
<Speech:QA ... >
  <Command id="HelpCmd1" scope="Pnl1" type="help" onClientReco="GlobalGiveHelp()">
    <Grammar src="grammars/help.gram"/>
  </Command>
</Speech:QA>
<script>
function GlobalGiveHelp() { ... }
</script>
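One way to picture scope resolution is as a walk from the control currently being talked about up through its ancestors in the primary controls tree, stopping at the first command of the requested type scoped to that node. The sketch below is only an illustration under that assumption: `resolveCommand`, the `parentOf` map and the plain-object command records are hypothetical names, and the second command (scoped to the textbox “tb2”) is a hypothetical narrower-scoped command added to show precedence.

```javascript
// Hypothetical sketch: pick the command of a given type that applies to a control.
// parentOf maps a control id to the id of its container (undefined at the root).
function resolveCommand(commands, parentOf, controlId, type) {
  for (let node = controlId; node !== undefined; node = parentOf[node]) {
    const hit = commands.find((c) => c.type === type && c.scope === node);
    if (hit !== undefined) return hit; // the nearest enclosing scope wins
  }
  return undefined; // no command of this type is in scope
}

// Tree from the example: panel "Pnl1" contains textboxes "tb1" and "tb2".
const parentOf = { tb1: "Pnl1", tb2: "Pnl1" };
const commands = [
  { id: "HelpCmd1", scope: "Pnl1", type: "help", handler: "GlobalGiveHelp" },
  // Hypothetical narrower command scoped to a single textbox.
  { id: "HelpCmd2", scope: "tb2", type: "help", handler: "SpecialGiveHelp" },
];

console.log(resolveCommand(commands, parentOf, "tb1", "help").handler);
console.log(resolveCommand(commands, parentOf, "tb2", "help").handler);
```

With this reading, a command scoped to the panel applies to every control inside it unless a command of the same type is scoped more narrowly.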

[0109] As specified, the “help” grammar will be active in every QA control relating to “Pnl1” and its contents. The GlobalGiveHelp subroutine will execute every time “help” is recognized. To override this and achieve context-sensitive behavior, the same typed command can be scoped to the required level of context:

<Speech:QA ... >
  <Command id="HelpCmd2" scope="tb2" type="help" onClientReco="SpecialGiveHelp()">
    <Grammar src="grammars/help.gram"/>
  </Command>
</Speech:QA>
<script>
function SpecialGiveHelp() { ... }
</script>

Confirmation Control

[0110] The QA control 320 can also include a method for simplifying the authoring of common confirmation subdialogs. The following QA control exemplifies a typical subdialog which asks and then confirms a value:

<Speech:QA id="qaDepCity" controlsToSpeechEnable="txtDepCity" runat="server">
  <!-- asking for a value -->
  <Question id="AskDepCity" type="ask" Answers="AnsDepCity">
    <prompt>Which city?</prompt>
  </Question>
  <Answer id="AnsDepCity" confirmThreshold="60">
    <grammar src="grammars/depCity.gram"/>
  </Answer>
  <!-- confirming the value -->
  <Confirm id="ConfirmDepCity" Answers="AnsConfDepCity">
    <prompt>Did you say <value targetElement="txtDepCity/Text"/>?</prompt>
  </Confirm>
  <Answer id="AnsConfDepCity">
    <grammar src="grammars/YesNoDepCity.gram"/>
  </Answer>
</Speech:QA>

[0111] In this example, a user response to ‘which city?’ which matches the AnsDepCity grammar but whose confidence level does not exceed the confirmThreshold value will trigger the confirm control 308. More flexible methods of confirmation available to the author include mechanisms using multiple question controls and multiple answer controls.

[0112] In a further embodiment, additional input controls related to the confirmation control include an accept control, a deny control and a correct control. Each of these controls could be activated (in a manner similar to the other controls) by the corresponding confirmation control and include grammars to accept, deny or correct results, respectively. For instance, users are likely to deny by saying “no”, to accept by saying “yes” or “yes” + current value (e.g., “Do you want to go to Seattle?” “Yes, to Seattle”), and to correct by saying “no” + new value (e.g., “Do you want to go to Seattle?” “No, Pittsburgh”).

[0113] Statement Control

[0114] The statement control allows the application developer to provide an output upon execution of the client side markup when a response is not required from the user of the client device 30. An example could be a “Welcome” prompt played at the beginning of execution of a client side markup page.

[0115] An attribute can be provided in the statement control to distinguish different types of information to be provided to the user of the client device. For instance, attributes can be provided to denote a warning message or a help message. These types could have different built-in properties such as different voices. If desired, different forms of statement controls can be provided, i.e. a help control, warning control, etc. Whether provided as separate controls or attributes of the statement control, the different types of statements have different roles in the dialog created, but share the fundamental role of providing information to the user of the client device without expecting an answer back.

[0116] Eventing

[0117] Event handlers as indicated in FIG. 11 are provided in the QA control 320, the output controls 308 and the input controls 310 for actions/inactions of the user of the client device 30 and for operation of the recognition server 204, to name a few; other events are specified in Appendix B. For instance, mumbling, where the speech recognizer detects that the user has spoken but is unable to recognize the words, and silence, where speech is not detected at all, are specified in the QA control 320. These events reference client-side script functions defined by the author. In the multimodal application specified earlier, a simple mumble handler that puts an error message in the text box could be written as follows:

<Speech:QA controlsToSpeechEnable="txtDepCity" onClientNoReco="OnMumble()" runat="server">
  <Answer id="AnsDepCity" StartEvent="onMouseDown" StopEvent="onMouseUp">
    <grammar src="/grammars/depCities.gram"/>
    <bind value="//sml/DepCity" targetElement="txtDepCity"/>
  </Answer>
</Speech:QA>
<script>
function OnMumble() {
  txtDepCity.value = "...recognition error...";
}
</script>

[0118] Control Execution Algorithm

[0119] In one embodiment, a client-side script or module (herein referred to as “RunSpeech”) is provided to the client device. The purpose of this script is to execute dialog flow via logic, which is specified in the script when executed on the client device 30, i.e. when the markup pertaining to the controls is activated for execution on the client due to values contained therein. The script allows multiple dialog turns between page requests, and therefore, is particularly helpful for control of voice-only dialogs such as through telephony browser 216. The client-side script RunSpeech is executed in a loop manner on the client device 30 until a completed form is submitted, or a new page is otherwise requested from the client device 30.

[0120] It should be noted that in one embodiment, the controls can activate each other (e.g. question control activating a selected answer control) due to values when executed on the client. However, in a further embodiment, the controls can “activate” each other in order to generate appropriate markup, in which case server-side processing may be implemented.

[0121] Generally, in one embodiment, the algorithm generates a dialog turn by outputting speech and recognizing user input. The overall logic of the algorithm is as follows for a voice-only scenario:

[0122] 1. Find next active output companion control;

[0123] 2. If it is a statement, play the statement and go back to 1; if it is a question or a confirm go to 3;

[0124] 3. Collect expected answers;

[0125] 4. Collect commands;

[0126] 5. Play output control and listen in for input;

[0127] 6. Activate recognized Answer or Command object or, issue an event if none is recognized;

[0128] 7. Go back to 1.
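The seven steps above can be sketched as a simple loop. This is a minimal illustration rather than the RunSpeech implementation of Appendix B: the object shapes (`outputs`, `answers`, `commands`, `grammar`, `onReco`), the `io` helper with `play`, `playAndListen` and `onNoReco`, and the string-list stand-in for a grammar are all assumptions made for the example, and recognition is simulated by a scripted list of utterances. The sketch also assumes that the loop ends when no output control is active.

```javascript
// Hypothetical sketch of the voice-only dialog loop (steps 1 to 7).
function runVoiceOnlyDialog(page, io) {
  for (;;) {
    const output = page.outputs.find((o) => o.isActive()); // step 1
    if (output === undefined) return;                       // assumption: nothing left to do
    if (output.kind === "statement") {                      // step 2
      io.play(output.prompt);
      output.played = true;
      continue;
    }
    const answers = output.answers;                         // step 3
    const commands = page.commands;                         // step 4
    const utterance = io.playAndListen(output.prompt);      // step 5
    const recognized = [...answers, ...commands]            // step 6
      .find((c) => c.grammar.includes(utterance));
    if (recognized !== undefined) recognized.onReco(utterance);
    else io.onNoReco(output);
  }                                                         // step 7: loop
}

// Scripted demonstration: one unrecognized utterance, then a city.
const state = { city: null };
const log = [];
const script = ["mumble", "Seattle"];
const welcome = {
  kind: "statement", prompt: "Welcome", played: false,
  isActive() { return !this.played; },
};
const askCity = {
  kind: "question", prompt: "Where would you like to go?",
  isActive: () => state.city === null,
  answers: [{ grammar: ["Seattle", "San Francisco"], onReco: (u) => { state.city = u; } }],
};
const demoPage = {
  outputs: [welcome, askCity],
  commands: [{ grammar: ["help"], onReco: () => log.push("help") }],
};
const demoIo = {
  play: (p) => log.push("play:" + p),
  playAndListen: (p) => { log.push("ask:" + p); return script.shift(); },
  onNoReco: () => log.push("noreco"),
};
runVoiceOnlyDialog(demoPage, demoIo);
```

In the demonstration the question is asked a second time because the first utterance matches no active grammar, which mirrors the re-prompting behavior described for the controls.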

[0129] In the multimodal case, the logic is simplified to the following algorithm:

[0130] 1. Wait for triggering event—i.e., user tapping on a control;

[0131] 2. Collect expected answers;

[0132] 3. Listen in for input;

[0133] 4. Activate recognized Answer object or, if none, throw event;

[0134] 5. Go back to 1.

[0135] The algorithm is relatively simple because, as noted above, controls contain built-in information about when they can be activated. The algorithm also makes use of the role of the controls in the dialogue. For example, statements are played immediately, while questions and confirmations are only played once the expected answers have been collected.

[0136] In a further embodiment, implicit confirmation can be provided whereby the system confirms a piece of information and asks a question at the same time. For example, the system could confirm the arrival city of a flight and ask for the travel date in one utterance: “When do you want to go to Seattle?” (i.e. asking ‘when’ and implicitly confirming ‘destination: Seattle’). If the user gives a date then the city is considered implicitly accepted since, if the city was wrong, users would have immediately challenged it. In this scenario, it becomes clear that the knowledge of what a user is trying to achieve is vitally important: are they answering the question, or are they correcting the value, or are they asking for help? By using the role of the user input in the dialogue the system can know when to implicitly accept a value.

[0137] In summary, a dialog is created due to the role of the control in the dialog and the relationship with other controls, wherein the algorithm executes the controls and thus manages the dialog. Each control contains information based on its type which is used by the execution algorithm to select (i.e. make active for execution) a given control according to whether or not it serves a useful purpose at that point in the dialog on the client. For example, confirmation controls are only active when there is a value to confirm and the system does not have sufficient confidence in that value to proceed. In a further implementation, most of these built-in pieces of information can be overridden or otherwise adapted by application developers.

[0138] The following table summarizes the controls, their corresponding role in the dialog and the relationship with other controls.

Control | Role in dialogue | Relationship with other controls
Statement | output: present information to users | (none)
Question | output: ask question | selects expected Answers as a response
Confirmation | output: confirm a value obtained from the user | selects potential input controls as a response, typically Accept, Deny, Correct
Answer | input: provide an answer to a question | selected by Question/Confirmation
Command | input: seek to repair a dialog, or change the topic of conversation | scoped to other controls
Accept | input: confirm a value in response to a confirmation | selected by a confirmation
Deny | input: deny a value in response to a confirmation | selected by a confirmation
Correct | input: correct a value in response to a confirmation | selected by a confirmation
QA | (wrapper: contains all the controls related to a specific topic) |

[0139] The use of these controls may be explained with an illustration of a simple human/computer dialog. In the dialog below, each dialog turn on the part of the System or the User is characterized according to the control (indicated in parentheses) which reflects its purpose in the dialog.

[0140] 1. System (Statement): “Welcome to the travel booking service”.

[0141] 2. System (Question): “Where would you like to go?”

[0142] 3. User (Answer): “San Francisco.”

[0143] 4. System (Confirmation): “Did you say Seattle?”

[0144] 5. User (Deny): “No.”

[0145] 6. System (Question): “Where would you like to go?”

[0146] 7. User (Answer): “San Francisco.”

[0147] 8. System (Confirmation): “Did you say Seattle?”

[0148] 9. User (Correct): “I said San Francisco.”

[0149] 10. System (Confirmation): “Did you say San Francisco?”

[0150] 11. User (Accept): “Yes.”

[0151] 12. System (Question): “When would you like to leave?”

[0152] 13. User (Command): “Help.”

[0153] Turn 1 is a statement on the part of the System. Since a statement control activates no answer controls in response, the system does not expect input. The system goes on to activate a question control at turn 2. This in turn activates a set of possible answer controls, including one which holds a grammar containing the cities available through the service, including “San Francisco”, “Seattle”, etc., which permits the user to provide such a city in turn 3.

[0154] The user's turn 3 is misrecognized by the system. Although the system believes it has a value from an answer control for the city, its confidence in that value is low (rightly so, since it has recognized incorrectly). This low confidence value in a just-received answer control is sufficient information for RunSpeech to trigger a confirmation control on the part of the system, as generated at turn 4. The confirmation control in turn activates a deny control, a correct control and an accept control and makes their respective grammars available to recognize the user's next turn. User turns 5, 9 and 11 illustrate example responses for these controls. In turn 5 the user simply denies the value (“No”). This has the effect of removing the value from the system, so the next action of RunSpeech is to ask the question again to re-obtain the value (turn 6).

[0155] Turns 7 and 8 return us to a confirmation control as with 3 and 4.

[0156] User turn 9 is a correct control, which has again been activated as a possible response to the confirmation control. A correct control not only denies the value undergoing confirmation, it also provides a new value. So user turn 9 is recognized by the system as a correct control with a new value which, correctly this time, is recognized as “San Francisco”.

[0157] The system's confidence in the new value is low, however, and yet another confirmation control is generated at turn 10. This in turn activates accept, deny and correct controls in response, and user turn 11 (“Yes”) matches an accept control grammar. The recognition of the accept control has the effect of ‘grounding’ the system's belief in the value which it is trying to obtain, and so RunSpeech is now able to select other empty values to obtain. In turn 12, a new question control is output which asks for a date value. The user's response this time (turn 13) is a command: “help”. Command controls are typically activated in global fashion, that is, independently of the different question controls and confirmation controls on the part of the system. In this way the user is able to ask for help at any time, as he does in turn 13. Command controls may also be more sensitively enabled by a mechanism that scopes their activation according to which part of the primary control structure is being talked about.

[0158] Referring back to the algorithm, in one exemplary embodiment, the client-side script RunSpeech examines the values inside each of the primary controls and an attribute of the QA control, and any selection test of the QA controls on the current page, and selects a single QA control for execution. For example, within the selected QA control, a single question and its corresponding prompt are selected for output, and then a grammar is activated related to typical answers to the corresponding question. Additional grammars may also be activated, in parallel, allowing other commands (or other answers), which are indicated as being allowable. Assuming recognition has been made and any further processing on the input data is complete, the client-side script RunSpeech will begin again to ascertain which QA control should be executed next. An exemplary implementation and algorithm of RunSpeech is provided in Appendix B.

[0159] It should be noted that the use of the controls and the RunSpeech algorithm or module is not limited to the client/server application described above, but rather can be adapted for use with other application abstractions. For instance, an application such as VoiceXML, which runs only on the client device 30, could conceivably include further elements or controls such as question and answer provided above as part of the VoiceXML browser and operating in the same manner. In this case the mechanisms of the RunSpeech algorithm described above could be executed by default by the browser without the necessity for extra script. Similarly, other platforms such as finite state machines can be adapted to include the controls and RunSpeech algorithm or module herein described.

[0160] Synchronization

[0161] As noted above, the companion controls 306 are associated with the primary controls 302 (the existing controls on the page). As such the companion controls 306 can re-use the business logic and presentation capabilities of the primary controls 302. This is done in two ways: storing values in the primary controls 302 and notifying the primary controls 302 of the changes.

[0162] The companion controls 306 synchronize or associate their values with the primary controls 302 via the mechanism called binding. Binding puts values retrieved from the recognizer into the primary controls 302, for example putting text into a textbox, herein exemplified with the answer control. Since primary controls 302 are responsible for visual presentation, this provides visual feedback to the users in multimodal scenarios.

[0163] The companion controls 306 also offer a mechanism to notify the primary controls 302 that they have received an input via the recognizer. This allows the primary controls 302 to take actions, such as invoking the business logic. (Since the notification amounts to a commitment of the companion controls 306 to the values which they write into the primary controls 302, the implementation provides a mechanism to control this notification with a fine degree of control. This control is provided by the RejectThreshold and ConfirmThreshold properties on the answer control, which specify numerical acoustic confidence values below which the system should respectively reject or attempt to confirm a value.)
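The effect of the two thresholds can be illustrated with a small decision function. This is a sketch only: the function name, the camel-cased property names on the plain `answer` object, the returned labels and the numeric scale are assumptions for the example, and the handling of a confidence exactly equal to a threshold is a choice made here, not something the description settles.

```javascript
// Hypothetical sketch: decide what to do with a recognized value based on
// its acoustic confidence and the answer's RejectThreshold / ConfirmThreshold.
function classifyRecognition(confidence, answer) {
  if (confidence < answer.rejectThreshold) return "reject";   // discard the value
  if (confidence < answer.confirmThreshold) return "confirm"; // keep it, but ask the user
  return "accept";                                            // commit and notify the primary control
}

const depCityAnswer = { rejectThreshold: 0.3, confirmThreshold: 0.7 };
console.log(classifyRecognition(0.2, depCityAnswer)); // "reject"
console.log(classifyRecognition(0.5, depCityAnswer)); // "confirm"
console.log(classifyRecognition(0.9, depCityAnswer)); // "accept"
```

Only the "accept" outcome would correspond to notifying the primary control, since notification amounts to a commitment to the value.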

[0164] A second exemplary set of companion controls 400 is illustrated in FIG. 12. In this embodiment, the companion controls 400 generally include a QA control 402, a Command control 404, a CompareValidator control 406, a CustomValidator control 408 and a semantic map 410. The semantic map 410 is schematically illustrated and includes semantic items 412 that form a layer between the visual domain primary controls 302 (e.g. HTML) and a non-visual recognition domain of the companion controls 400.

[0165] At this point, it should be emphasized that although the organization of the companion controls QA and Command is different than that of the first set of companion controls discussed above, the functionality remains the same. In particular, the QA control 402 includes a Prompt property that references Prompt objects to perform the functions of output controls, i.e. that provide “prompting” client side markups for human dialog, which typically involves the playing of a prerecorded audio file, or text for text-to-speech conversion, the data included in the markup directly or referenced via a URL. Likewise, the input controls are embodied as the QA control 402 and Command control 404 and also follow human dialog and include the Prompt property (referencing a Prompt object) and an Answer property that references at least one Answer object. Both the QA control 402 and the Command control 404 associate a grammar with expected or possible input from the user of the client device 30. The QA control 402 in this embodiment can thus be considered a question control, an answer control as well as a confirm control and a statement control since it includes properties necessary for performing these functions.

[0166] Although the QA control 402, Command control 404, CompareValidator control 406 and CustomValidator control 408 and other controls as well as the general structure of these controls, the parameters and event handlers, are specifically discussed with respect to use as companion controls 400, it should be understood that these controls, the general structure, parameters and event handlers can be adapted to provide recognition and/or audible prompting in the other two approaches discussed above with respect to FIGS. 7 and 8. For instance, the Semantic Map 410, which comprises another exemplary mechanism to form the association between the companion controls and visual control 302, would not be needed when embodied in the approaches of FIGS. 7 and 8.

[0167] At this point, it may be helpful to provide a short description of each of the controls. Detailed descriptions are provided below in Appendix C.

[0168] QA Control

[0169] In general, the QA control 402 through the properties illustrated can perform one or more of the following: provide output audible prompting, collect input data, perform confidence validation of the input result, allow confirmation of input data and aid in control of dialog flow at the website, to name a few. In other words, the QA control 402 contains properties that function as controls for a specific topic.

[0170] The QA control 402, like the other controls, is executed on the web server 202, which means it is defined on the application development web page held on the web server using the server-side markup formalism (ASP, JSP or the like), but is output as a different form of markup to the client device 30. Although illustrated in FIG. 12 where the QA control appears to be formed of all of the properties Prompt, Reco, Answers, ExtraAnswers and Confirms, it should be understood that these are merely options wherein one or more may be included for a QA control.

[0171] At this point it may be helpful to explain use of the QA controls 402 in terms of application scenarios. Referring to FIG. 12, in a voice-only application the QA control 402 could function as a question and an answer in a dialog. The question would be provided by a Prompt object, while a grammar is defined through a grammar object for recognition of the input data and related processing on that input. An Answers property associates the recognized result with a SemanticItem 412 in the Semantic Map 410 using an Answer object, which contains information on how to process recognition results. Line 414 represents the association of the QA control 402 with the Semantic Map 410, and to a SemanticItem 412 therein. Many SemanticItems 412 are individually associated with a visual or primary control 302 as represented by line 418, although one or more SemanticItems 412 may not be associated with a visual control and may be used only internally. In a multimodal scenario, where the user of the client device 30 may touch on the visual textbox, for example with a “TapEvent”, an audible prompt may not be necessary. For example, for a primary control comprising a textbox having visual text forming an indication of what the user of the client device should enter in the corresponding field, a corresponding QA control 402 may or may not have a corresponding prompt such as an audio playback or a text-to-speech conversion, but would have a grammar corresponding to the expected value for recognition, and event handlers to process the input, or process other recognizer events such as no speech detected, speech not recognized, or events fired on timeouts.

[0172] In a further embodiment, the recognition result includes a confidence level measure indicating the level of confidence that the recognized result was correct. A confirmation threshold can also be specified in the Answer object, for example, as ConfirmThreshold equals 0.7. If the confirmation level exceeds the associated threshold, the result can be considered confirmed.

[0173] It should also be noted that in addition, or in the alternative, to specifying a grammar for speech recognition, QA controls and/or Command controls can specify Dtmf (dual tone multi-frequency) grammars to recognize telephone key activations in response to prompts or questions. Appendix C provides details of a Dtmf object that applies a different modality of grammar (a keypad input grammar rather than, for example, a speech input grammar) to the same question. Some of the properties of the Dtmf object include Preflush, which is a flag indicating if “type-ahead” functionality is allowed in order that the user can provide answers to questions before they are asked. Other properties include the number of milliseconds to wait for receiving the first key press, InitialTimeOut, and the number of milliseconds to wait between adjacent key presses, InterdigitTimeOut. Client-side script functions can be specified for execution through other properties, for example, when no key press is received, OnClientSilence, or when the input is not recognized, OnClientNoReco, or when an error is detected, OnClientError.

[0174] At this point it should be noted that when a SemanticItem 412 of the Semantic Map 410 is filled through recognition, for example, speech or Dtmf, several actions can be taken. First, an event can be issued or fired indicating that the value has been “changed”. Depending on whether the confirmation level was met, another event that can be issued or fired is a “confirm” event that indicates that the corresponding semantic item has been confirmed. These events are used for controlling dialog.

[0175] The Confirms property can also include answer objects having the structure similar to that described above with respect to the Answers property in that it is associated with a SemanticItem 412 and can include a ConfirmThreshold if desired. The Confirms property is not intended to obtain a recognition result per se, but rather, to confirm a result already obtained and ascertain from the user whether the result obtained is correct. The Confirms property is a collection of Answer objects used to assert whether the value of a previously obtained result was correct. The containing QA's Prompt object inquires about these items, obtains the recognition result from the associated SemanticItem 412 and forms it in a question such as “Did you say Seattle?” If the user responds with affirmation such as “Yes”, the confirmed event is then fired. If the user responds in the negative such as “No”, the associated SemanticItem 412 is cleared.

[0176] It should be noted that in a further embodiment, the Confirms property can also accept corrections after a confirmation prompt has been provided to the user. For instance, in response to a confirmation prompt “Did you say Seattle?” the user may respond “San Francisco” or “No, San Francisco”, in which case, the QA control has received a correction. Having information as to which SemanticItem is being confirmed through the Answer object, the value in the SemanticItem can be replaced with the corrected value. It should also be noted that if desired, confirmation can be included in a further prompt for information such as “When did you want to go to Seattle?”, where the prompt by the system includes a confirmation for “Seattle” and a further prompt for the day of departure. A response by the user providing a correction to the place of destination would activate the Confirms property to correct the associated semantic item, while a response with only a day of departure would provide implicit confirmation of the destination.
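The three outcomes of a confirmation prompt described above (affirmation, denial and correction) can be sketched as a small state update on the semantic item being confirmed. Everything here is illustrative: the function name, the `kind` labels, the `fireEvent` callback and the "Confirmed" state name are assumptions, and leaving a corrected value in the NeedsConfirmation state is one possible design choice rather than something the description prescribes.

```javascript
// Hypothetical sketch: apply the user's reply to a confirmation prompt
// to the semantic item that is being confirmed.
function applyConfirmResponse(item, response, fireEvent) {
  switch (response.kind) {
    case "accept": // e.g. "Yes"
      item.state = "Confirmed";
      fireEvent("confirmed", item);
      break;
    case "deny": // e.g. "No": the item is cleared so it can be asked for again
      item.value = null;
      item.state = "Empty";
      break;
    case "correct": // e.g. "No, San Francisco": replace the value
      item.value = response.value;
      item.state = "NeedsConfirmation"; // assumption: a corrected value may be confirmed again
      fireEvent("changed", item);
      break;
    default:
      throw new Error("unknown response kind: " + response.kind);
  }
  return item;
}

const fired = [];
const destination = { value: "Seattle", state: "NeedsConfirmation" };
applyConfirmResponse(destination, { kind: "correct", value: "San Francisco" }, (name) => fired.push(name));
applyConfirmResponse(destination, { kind: "accept" }, (name) => fired.push(name));
```

After the two calls in the example, the item holds the corrected value and has been confirmed.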

[0177] The ExtraAnswers property allows the application author to specify Answer objects that a user may provide in addition to a prompt or query that has been made. For instance, if a travel oriented system prompts a user for a destination city, but the user responds by indicating “Seattle tomorrow”, the Answers property that initially prompted the user will retrieve and therefore bind the destination city “Seattle” to the appropriate SemanticItem, while the ExtraAnswers property can process “Tomorrow” as the next succeeding day (assuming that the system knows the current day), and thereby, bind this result to the appropriate SemanticItem in the Semantic Map. The ExtraAnswers property includes one or more Answer objects defined for possible extra information the user may also state. In the example provided above, having also retrieved information as to the day of departure, the system would then not need to reprompt the user for this information, assuming that the confirmation level exceeded the corresponding ConfirmThreshold. If the confirmation level did not exceed the corresponding threshold, the appropriate Confirms property would be activated.

[0178] Command Control

[0179] Command controls 404 handle user utterances common in voice-only dialogs which typically have little semantic import in terms of the question asked, but rather seek assistance or effect navigation, e.g. help, cancel, repeat, etc. The Command control 404 can include a Prompt property to specify a prompt object. In addition, the Command control 404 can be used to specify not only the grammar (through a Grammar property) and associated processing on recognition (rather like an Answer object without binding of the result to a SemanticItem), but also a ‘scope’ of context and a type. This allows for the authoring of both global and context-sensitive behavior on the client side markup. The Command control 404 allows additional types of input such as “help” commands, or commands that allow the user of the client device to navigate to other selected areas of the website.

[0180] CompareValidator Control

[0181] The CompareValidator control compares two values according to an operator and takes an appropriate action. The values to be compared can be of any form such as integers, strings of text, etc. The CompareValidator includes a property SemanticItemToValidate that indicates the SemanticItem that will be validated. The SemanticItem to be validated can be compared to a constant or another SemanticItem, where the constant or other SemanticItem is provided by properties ValueToCompare and SemanticItemToCompare, respectively. Other parameters or properties associated with the CompareValidator include Operator, which defines the comparison to be made, and Type, which defines the type of value, for example, integer or string, of the semantic items.

[0182] If the validation associated with the CompareValidator control fails, a Prompt property can specify a Prompt object that can be played instructing the user that the result obtained was incorrect. If upon comparison the validation fails, the associated SemanticItem defined by SemanticItemToValidate is indicated as being empty, in order that the system will reprompt the user for a correct value. However, it may be helpful to not clear the incorrect value of the associated SemanticItem in the Semantic Map in the event that the incorrect value will be used in a prompt to the user reiterating the incorrect value. The CompareValidator control can be triggered either when the value of the associated SemanticItem changes or when the value has been confirmed, depending on the desires of the application author.

[0183] CustomValidator Control

[0184] The CustomValidator control is similar to the CompareValidator control. A property SemanticItemToValidate indicates the SemanticItem that will be validated, while a property ClientValidationFunction specifies a custom validation routine through an associated function or script. The function would provide a Boolean value “yes” or “no”, or an equivalent thereof, indicating whether or not the validation failed. A Prompt property can specify a Prompt object to provide indications of errors or failure of the validation. The CustomValidator control can be triggered either when the value of the associated SemanticItem changes or when the value has been confirmed, depending on the desires of the application author.

[0185] Control Execution Algorithm

[0186] As in the previous set of controls, a client-side script or module (herein referred to as “RunSpeech”) is provided to the client device for the controls of FIG. 12. Again, the purpose of this script is to execute dialog flow via logic, which is specified in the script when executed on the client device 30, i.e. when the markup pertaining to the controls is activated for execution on the client due to values contained therein. The script allows multiple dialog turns between page requests, and therefore, is particularly helpful for control of voice-only dialogs such as through telephony browser 216. The client-side script RunSpeech is executed in a loop manner on the client device 30 until a completed form is submitted, or a new page is otherwise requested from the client device 30.

[0187] Generally, in one embodiment, the algorithm generates a dialog turn by outputting speech and recognizing user input. The overall logic of the algorithm is as follows for a voice-only scenario (reference is made to Appendix C for properties or parameters not otherwise discussed above):

[0188] 1. Find the first active (as defined below) QA, CompareValidatoror CustomValidator control in speech index order.

[0189] 2. If there is no active control, submit the page.

[0190] 3. Otherwise, run the control.
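The three steps above can be sketched as a small loop. This is a minimal illustration only, not the RunSpeech script itself: the Control class, the run_speech function and the max_turns safety limit are hypothetical names introduced here, and the sketch assumes that lower speech index values are visited first, which the text above does not state.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Control:
    """Hypothetical stand-in for a QA, CompareValidator or CustomValidator."""
    name: str
    speech_index: int
    is_active: Callable[[], bool]
    run: Callable[[], None]


def find_first_active(controls: List[Control]) -> Optional[Control]:
    """Step 1: return the first active control in speech index order."""
    # Assumption: lower SpeechIndex values are visited first.
    for control in sorted(controls, key=lambda c: c.speech_index):
        if control.is_active():
            return control
    return None


def run_speech(controls: List[Control], submit_page: Callable[[], None],
               max_turns: int = 100) -> bool:
    """Loop until no control is active, then submit the page.

    max_turns is a safety limit added for this sketch only.
    Returns True if the page was submitted.
    """
    for _ in range(max_turns):
        control = find_first_active(controls)
        if control is None:
            submit_page()   # step 2: nothing left to ask or validate
            return True
        control.run()       # step 3: run one dialog turn
    return False
```

In this sketch each call to run is expected to change state (for example, fill a SemanticItem) so that the control eventually stops being active.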

[0191] A QA is considered active if and only if:

[0192] 1. The QA's clientActivationFunction either is not present or returns true, AND

[0193] 2. If the Answers property collection is non-empty, the State of all of the SemanticItems pointed to by the set of Answers is Empty, OR

[0194] 3. If the Answers property collection is empty, the State of at least one SemanticItem in the Confirm array is NeedsConfirmation.

[0195] However, if the QA has PlayOnce set to true and its Prompt has been run successfully (reached OnComplete), the QA will not be a candidate for activation.
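One possible reading of the activation rules above is sketched below. The sketch interprets the conditions as "1 AND (2 OR 3)", which is an assumption about how AND and OR group. The class and field names (QA, SemanticItem, prompt_completed) and the CONFIRMED state are hypothetical and are used only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class State(Enum):
    EMPTY = "Empty"
    NEEDS_CONFIRMATION = "NeedsConfirmation"
    CONFIRMED = "Confirmed"  # assumed name for a confirmed value


@dataclass
class SemanticItem:
    state: State = State.EMPTY


@dataclass
class QA:
    answers: List[SemanticItem] = field(default_factory=list)
    confirms: List[SemanticItem] = field(default_factory=list)
    client_activation_function: Optional[Callable[[], bool]] = None
    play_once: bool = False
    prompt_completed: bool = False  # True once the Prompt reached OnComplete


def qa_is_active(qa: QA) -> bool:
    # PlayOnce exclusion: a prompt that already completed is not replayed.
    if qa.play_once and qa.prompt_completed:
        return False
    # Rule 1: the activation function is absent or returns true.
    if qa.client_activation_function is not None:
        if not qa.client_activation_function():
            return False
    # Rule 2: with Answers present, every target SemanticItem must be Empty.
    if qa.answers:
        return all(item.state is State.EMPTY for item in qa.answers)
    # Rule 3: with no Answers, some Confirm target must need confirmation.
    return any(item.state is State.NEEDS_CONFIRMATION for item in qa.confirms)
```

Read literally, a QA with neither Answers nor Confirm targets is never active under these rules; the sketch follows that literal reading.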

[0196] A QA is run as follows:

[0197] 1. If this is a different control than the previous active control, reset the prompt Count value.

[0198] 2. Increment the prompt Count value.

[0199] 3. If PromptSelectFunction is specified, call the function and set the Prompt's inlinePrompt to the returned string.

[0200] 4. If a Reco object is present, start it. This Reco should already include any active command grammar.

[0201] A Validator (either a CompareValidator or a CustomValidator) is active if:

[0202] 1. The SemanticItemToValidate has not been validated by this validator and its value has changed.

[0203] A CompareValidator is run as follows:

[0204] 1. Compare the values of the SemanticItemToCompare or ValueToCompare and SemanticItemToValidate according to the validator's Operator.

[0205] 2. If the test returns false, empty the text field of the SemanticItemToValidate and play the prompt.

[0206] 3. If the test returns true, mark the SemanticItemToValidate as validated by this validator.

[0207] A CustomValidator is run as follows:

[0208] 1. The ClientValidationFunction is called with the value of the SemanticItemToValidate.

[0209] 2. If the function returns false, the SemanticItem is cleared and the prompt is played; otherwise, the SemanticItem is marked as validated by this validator.
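The validator steps above share one shape: apply a test to the value, then either mark the item as validated or clear it and play the prompt. A minimal sketch of that shape follows. The operator names in OPERATORS, the validated_by bookkeeping and the function names are assumptions made for illustration; the actual Operator values are listed in Appendix C and are not reproduced here.

```python
import operator
from dataclasses import dataclass, field
from typing import Callable, Optional, Set

# Assumed operator names, for illustration only.
OPERATORS = {
    "Equal": operator.eq,
    "NotEqual": operator.ne,
    "GreaterThan": operator.gt,
    "GreaterThanEqual": operator.ge,
    "LessThan": operator.lt,
    "LessThanEqual": operator.le,
}


@dataclass
class SemanticItem:
    value: Optional[object] = None
    validated_by: Set[str] = field(default_factory=set)

    def set_value(self, value: Optional[object]) -> None:
        self.value = value
        self.validated_by.clear()  # a changed value must be validated again


def compare_test(op_name: str, value_to_compare: object) -> Callable[[object], bool]:
    """Build the test a CompareValidator would apply (ValueToCompare case)."""
    compare = OPERATORS[op_name]
    return lambda value: compare(value, value_to_compare)


def run_validator(validator_id: str, item: SemanticItem,
                  test: Callable[[object], bool],
                  play_prompt: Callable[[], None]) -> bool:
    """Run either kind of validator; `test` is the comparison or custom function."""
    if test(item.value):
        item.validated_by.add(validator_id)  # mark as validated by this validator
        return True
    item.set_value(None)                     # clear so the user is reprompted
    play_prompt()
    return False
```

A CustomValidator corresponds to passing any Boolean function as test, for example run_validator("even", item, lambda v: v % 2 == 0, play_prompt).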

[0210] A Command is considered active if and only if:

[0211] 1. It is in Scope, AND

[0212] 2. There is not another Command of the same Type lower in the scope tree.

[0213] In the multimodal case, the logic is simplified to the following algorithm:
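The two Command rules can be illustrated with a short sketch. It assumes that the scope tree can be summarized as the path of element identifiers from the page root down to the running QA, and that "lower" means deeper on that path; both are interpretations, and the Command class and active_commands function are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Command:
    type: str
    scope: str  # id of the element the command is scoped to


def active_commands(commands: List[Command], scope_path: List[str]) -> List[Command]:
    """Return the active commands for the QA at the end of scope_path.

    scope_path lists element ids from the page root down to the running QA.
    """
    depth = {element_id: i for i, element_id in enumerate(scope_path)}
    best: Dict[str, Command] = {}
    for command in commands:
        if command.scope not in depth:
            continue  # rule 1: the command is not in scope
        current = best.get(command.type)
        # Rule 2: a command of the same Type lower in the tree takes over.
        if current is None or depth[command.scope] > depth[current.scope]:
            best[command.type] = command
    return list(best.values())
```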

[0214] 1. Wait for triggering event—i.e., user tapping on a control;

[0215] 2. Collect expected answers;

[0216] 3. Listen in for input;

[0217] 4. Bind result to SemanticItem, or if none, throw event;

[0218] 5. Go back to 1.

[0219] In a multi-modal environment, it should be noted that if the user corrects the text box or other input field associated with a visual presentation of the result, the system can update the associated SemanticItem to indicate that the value has been confirmed.

[0220] In a further embodiment, controls are provided that enable application authors to create speech applications that handle telephony transactions. In general, the controls implement or invoke well-known telephony transactions such as ECMA (European Computer Manufacturers Association) CSTA (Computer Supported Telecommunications Applications) messages, eventing and services. As is known, CSTA specifies application interfaces and protocols for monitoring and controlling calls and devices in a communication network. These calls and devices may support various media and can reside in various network environments such as IP, Switched Circuit Networks and mobile networks.

[0221] In the illustrated embodiment, the controls available to the application author include a SmexMessage control (SMEX: Simple Message Exchange), a TransferCall control, a MakeCall control, a DisconnectCall control and an AnswerCall control. Like the controls described above, these controls can be executed on the server so as to generate client-side markup that, when executed on the client device, performs the desired telephony transaction.

[0222] Referring to FIG. 4, the client-side markup generated by server 202 can be executed by voice browser 216, which in turn provides telephony transaction instructions (e.g. CSTA service calls) to the media server 214 and gateway 210 as necessary to perform the desired telephony transaction. Appendix C provides detailed information regarding each of the properties available in the controls. The controls are commonly used in a voice-only mode such as by voice browser 216 in FIG. 4; however, it should be understood that applications can also be written to be executed in a multi-modal client device.

[0223] FIG. 12 schematically illustrates the call controls at 407. The call controls 407 described further below are generally used in conjunction with the controls described above such as the QA control 402, Command control 404 and/or validators 406 and 408 to provide audio prompting, if necessary, and perform recognition so as to perform desired telephony transactions.

[0224] The SmexMessage control allows application authors to send and receive raw CSTA messages. Like the controls discussed above, the call-related controls include a SpeechIndex property that controls the order of the object within the RunSpeech algorithm. Since the number and types of events generated by sending a message with the SmexMessage control are unknown, the application author should be careful about when the RunSpeech algorithm can continue.

[0225] A required property of the SmexMessage control is the CSTA XML message to be sent. Optional properties can specify a client-side function that is called before the message is sent in order to modify the message, and a client-side function that is called when a SMEX object receives a SMEX event. The SmexMessage control may also be used to receive incoming telephone calls.

[0226] The call-related server-side controls discussed below deal with a single device and a single active call at any given time. If the application author needs to monitor more than one device or handle more than one active call, the SmexMessage control can be used by the application author to provide code to handle CSTA messages.

[0227] The TransferCall control is used to transfer the current call using the CSTA SingleStepTransfer service. Required properties include a device identifier for the endpoint to which the call is transferred. Other properties can include client-side functions to be called when the call is transferred or when CSTA returns a failed event. In addition, a server-side event can be issued when the call is transferred.

[0228] The MakeCall control makes an outbound call to a given number on a given device when the RunSpeech algorithm runs this object. Required properties include an identifier of the device that the control will use to place the outbound call and the phone number to dial. Server-side events can be issued when a call is connected. Likewise, client-side functions can be called when the call is connected or when the call fails as indicated by a CSTA message returning a failed event.

[0229] The DisconnectCall control allows application authors to disconnect or terminate telephone calls using the CSTA ClearConnection service. If desired, a server-side event can be issued when the call is disconnected and/or a client-side function can be called when the call is disconnected.

[0230] The AnswerCall control answers incoming calls on a given device using the CSTA AnswerCall service. In a manner similar to the DisconnectCall control discussed above, a server-side event can be issued when the call is connected, and/or a client-side function can be called when the call is connected.

[0231] Having described above QA control 402, Command control 404, CompareValidator control 406 and CustomValidator control 408, at this point it should be noted that one or more of these controls can be grouped or formed as an application control 430 as also illustrated in FIG. 12. In general, an application control 430 provides a means to wrap common speech scenarios in one control. In particular, an application control 430 can include one or more QA controls 402, one or more of the validator controls 406, 408 and one or more Command controls 404 as desired. An application control 430 would include all necessary prompts, for example, a prompt to solicit a question, to confirm a recognized result, or to specify that the recognized result is in error due to operation of a compare validator, etc. Commonly, application control 430 would also reference one or more SemanticItems 412 in the Semantic map 410 in order that the recognized results are placed in the Semantic map 410 with confirmation and validation performed as required, or as desired. In short, an application control 430, which can take many different forms, such as illustrated in Appendix D, allows the application author to rapidly develop an application by using application controls 430 rather than manually coding all the necessary syntax to perform a function, confirm the recognized result as well as perform any form of validation. The application control 430 receives parameters through properties that allow the application control 430 to generate the corresponding syntax of QA controls 402, Command controls 404, CustomValidator controls 408 and CompareValidator controls 406 as if these controls were manually coded. This use of application controls 430 allows rapid development of a desired speech-enabled application.

[0232] In the illustrative embodiment as described in Appendix D, an application control is derived from one of two base classes, BasicApplicationControl or ApplicationControl. Each class has associated therewith properties, which generally relate to information that is used in order to generate the syntax using QA controls, CompareValidator controls, CustomValidator controls and/or Command controls. The BasicApplicationControl includes properties that generally relate to asking a question and obtaining recognized results. This includes making a prompt (i.e. it does the basic data acquisition) and specifying parameters such as BabbleTimeout and Bargein, if desired, as well as a property to be passed to all relevant internal QA controls that are used to process recognized results for words that do not impart semantic meaning. BasicApplicationControl also includes a property that specifies a client-side function that allows authors to select and/or modify a prompt string prior to playback. Although prompts could be encoded directly in the application control, in a further embodiment, all prompts are organized in a list, from which a prompt can be selected by a function denoted in Appendix D as PromptSelectFunction.

[0233] The ApplicationControl inherits all the properties associated with the BasicApplicationControl and contains further properties that an application control can support. For instance, for an application control that is derived from the ApplicationControl class, internal QA controls created by the application control can specify a common threshold for accepting or rejecting utterances pertaining to confirmation. Other properties that can be included in an application control include specifying the name of the event that starts or stops recognition in multi-modal mode, such as on activation of a mouse button, for example, where depressing the button starts acquisition of user voice input and releasing the button stops acquisition. Yet other properties specify the identifiers of the visual control that will issue the corresponding start and stop events. It is worth noting that the BasicApplicationControl class and the ApplicationControl class may be merged to form a single class, as is known in the art. Other more specific base classes can also be used for specific applications and/or in order to generate customized application controls.

[0234] At this point it may be helpful to discuss various application controls, including an application control to retrieve a natural number, an application control to retrieve a string of numbers/letters and an application control to navigate a table, which can also be used to select an item from a one-column table or list. These application controls will be discussed generally, highlighting important conceptual elements, while Appendix D provides additional details or options that can be invoked.

[0235] The first application control discussed is the NaturalNumber control, which is used to retrieve a natural number, herein exemplified as between 0 and 999,999. The NaturalNumber control is derived from the ApplicationControl class and inherits all properties associated with this class, and also includes additional properties to specify the visual control with which it is associated through the property SemanticItem, which identifies the ID of the SemanticItem to receive the value spoken by the user. In general, the NaturalNumber control will provide code comprising a QA control that includes a prompt object as a question, an answer object for the specified SemanticItem, a confirm object for performing confirmation and one or more validating functions, such as implemented through CompareValidator controls, to compare the value recognized to a LowerBound property and/or an UpperBound property. If both a LowerBound and an UpperBound are specified, code can be generated specifying two CompareValidator controls, one comparing the value to the LowerBound and a second comparing the value to the UpperBound.
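As an illustration of how bounds might expand into validators, the sketch below returns one validator description per specified bound. It is not the code generated by the NaturalNumber control: the dictionary layout, the operator names and the choice of inclusive bounds are all assumptions.

```python
from typing import Dict, List, Optional


def natural_number_validators(semantic_item: str,
                              lower_bound: Optional[int] = None,
                              upper_bound: Optional[int] = None) -> List[Dict[str, object]]:
    """Describe the CompareValidator controls implied by the given bounds."""
    validators: List[Dict[str, object]] = []
    if lower_bound is not None:
        validators.append({
            "SemanticItemToValidate": semantic_item,
            "Operator": "GreaterThanEqual",  # assumed name, inclusive bound
            "ValueToCompare": lower_bound,
        })
    if upper_bound is not None:
        validators.append({
            "SemanticItemToValidate": semantic_item,
            "Operator": "LessThanEqual",     # assumed name, inclusive bound
            "ValueToCompare": upper_bound,
        })
    return validators
```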

[0236] In general, the NaturalNumber application control will generate code that upon execution first activates the QA control to prompt for a value. Upon receipt of the value, if confirmation is necessary, the confirm object will be activated. Validation of the value through the validator functions can be executed after a change in the value in the SemanticItem or after confirmation, as selected by the application author through a ValidationEvent property. The confirmation and validation may be executed through a suitable dialog flow that is automatically generated upon instantiating the NaturalNumber control. Thus, an author need not generate a customized dialog flow in order to get a number from a user. It should be mentioned that the execution flow described in Appendix D for this control as well as others may include SpeechIndex values that appear to perform confirmation prior to prompting the question; however, activation of these objects does not sequentially follow the assigned SpeechIndex, but rather, is determined by whether the action, such as confirmation of a received value, is necessary.

[0237] The AlphaDigit control retrieves a string of numbers and/or letters. The AlphaDigit control is derived from the ApplicationControl class and inherits all properties associated with this class, and also includes additional properties to specify the visual control with which it is associated through the property SemanticItem, which identifies the ID of the SemanticItem to receive the value spoken by the user. In general, the AlphaDigit control will provide code comprising a QA control that includes a prompt object as a question, an answer object for the specified SemanticItem and a confirm object for performing confirmation.

[0238] Other unique properties of the AlphaDigit control include an InputMask property that defines the format of the input to the AlphaDigit control. In particular, the InputMask can define, for each position of the input received, a wildcard or a range denoted herein by brackets. Separate wildcards are provided for alphabetical characters, for numerical characters, and for characters that are either alphabetical or numerical. The range of acceptable characters can be listed separately within the brackets, for example, “[123]” for “1”, “2” or “3”, or through the use of a hyphen, “[1-3]”, or through combinations such as “[1-3 a-c]”, which would allow “1”, “2”, “3”, “a”, “b” or “c”. In one embodiment, a grammar is automatically generated based on the InputMask. Thus, for example, the application may be configured to recognize a user speaking “1”, “2” or “3” that corresponds to the input mask.
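The mask notation described above can be illustrated by translating a mask into a regular expression. This is a stand-in for the automatically generated grammar, not a description of it. The wildcard characters used here ("A" for a letter, "D" for a digit, "*" for either) are hypothetical, since the actual wildcard symbols are defined in Appendix D and are not reproduced here.

```python
import re

# Hypothetical wildcard symbols; the real ones are defined in Appendix D.
WILDCARDS = {"A": "[A-Za-z]", "D": "[0-9]", "*": "[A-Za-z0-9]"}


def mask_to_regex(mask: str) -> re.Pattern:
    """Translate an input mask into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(mask):
        ch = mask[i]
        if ch == "[":
            end = mask.find("]", i)
            if end == -1:
                raise ValueError("unclosed bracket in input mask")
            body = mask[i + 1:end].replace(" ", "")
            # Allow only letters, digits and hyphens inside a range.
            if not re.fullmatch(r"[A-Za-z0-9-]+", body):
                raise ValueError(f"unsupported range: {body!r}")
            parts.append("[" + body + "]")
            i = end + 1
        elif ch in WILDCARDS:
            parts.append(WILDCARDS[ch])
            i += 1
        else:
            raise ValueError(f"unsupported mask character: {ch!r}")
    return re.compile("".join(parts))


def accepts(mask: str, text: str) -> bool:
    """True if text matches the mask position by position."""
    return mask_to_regex(mask).fullmatch(text) is not None
```

For instance, accepts("[1-3 a-c]", "b") holds in this sketch while accepts("[1-3 a-c]", "d") does not, mirroring the bracket examples in the text.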

[0239] In general, the AlphaDigit application control will generate code that upon execution first activates the QA control to prompt for a value. Upon receipt of the value, if confirmation is necessary, the confirm object will be activated.

[0240] Other application controls provided in Appendix D include a Currency application control to retrieve a monetary amount such as in dollars, various numerical information in selected formats such as a Phone application control to retrieve a phone number such as a 10 digit U.S. phone number, a Zipcode application control to retrieve a U.S. zipcode/zipcode extension, a SocialSecurityNumber application control to retrieve a U.S. Social Security number, as well as a Date application control for retrieving a calendar date and a YesNo application control for retrieving a yes or no answer. Many of these application controls implement multiple SemanticItems, each having a corresponding question prompt, and separate confirm objects that can be activated for each SemanticItem, if necessary.

[0242] Retrieval of specific number sequences, such as telephone numbers, social security numbers and credit card numbers, can be implemented with specific controls as desired. For example, a user may be asked for all or a portion of a number sequence and be prompted until the control has received all the necessary digits. After the digits have been received, the control may confirm the entire sequence. If the sequence is accepted by the user, the control may exit. Otherwise, the control may confirm the sequence portion-by-portion, where the user can accept or deny shorter sequences of characters and/or digits or even individual characters or digits. The control may then ask for portions that were denied by the user. For example, a control for a social security number could confirm three portions of three digits, two digits and four digits, respectively, as is the typical format for a social security number. Another example includes obtaining dates by day, month and year portions, particularly if recognition of the full date is unsuccessful.
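The portion-by-portion confirmation described above can be sketched as follows. The function names and the confirm callback (standing in for a yes/no exchange with the user) are hypothetical, and the sketch covers only one round of confirmation, not the reprompting that follows.

```python
from typing import Callable, List, Sequence


def split_portions(digits: str, sizes: Sequence[int]) -> List[str]:
    """Split a digit string into consecutive portions of the given sizes."""
    if len(digits) != sum(sizes):
        raise ValueError("digit count does not match portion sizes")
    portions, start = [], 0
    for size in sizes:
        portions.append(digits[start:start + size])
        start += size
    return portions


def portions_to_reask(digits: str, sizes: Sequence[int],
                      confirm: Callable[[str], bool]) -> List[int]:
    """Return the indexes of portions the user denied.

    The whole sequence is confirmed first; only if it is denied is each
    portion confirmed separately.
    """
    if confirm(digits):
        return []
    return [i for i, portion in enumerate(split_portions(digits, sizes))
            if not confirm(portion)]
```

For a social security number the sizes would be (3, 2, 4); the control would then ask again only for the portions whose indexes are returned.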

[0243] FIG. 14 illustrates a system 500 for generating a DataTableNavigator application control that allows a user to navigate through and render data in a table of information by using voice commands. In order to generate this application control, table information 502 is supplied to a suitable code generator 504. Using the table information 502, code generator 504 generates navigator control code or parameters 506. In one mode of operation, table information 502 includes a data source, header fields of the table and content fields of the table. The data source identifies where a particular table is stored, while the header fields and content fields identify information within the table. Table information 502 may also include other customized items such as specified grammars, prompts and others.

[0244] Alternatively, table information 502 may refer to a simple list of selectable choices. The list may contain a single column or multiple columns, wherein the user can select an item from a particular column. An action may then be performed on the selection. For example, a user may select a departure city for travel plans from a list of cities, and a semantic item is updated based on the user's choice. The selection is performed in a manner similar to the dialog examples described below with reference to FIGS. 18-19.

[0245] Code generator 504 may include various default configurations in order to easily generate and implement the navigator control code 506. For example, code generator 504 may be configured to recognize commands such as “next” and “previous” using a default grammar. Additionally, code generator 504 may automatically generate a grammar based on header fields, content headings and/or a list of selectable choices in the table. Accordingly, an author may rapidly develop table navigation code that contains dialog flow, grammars and prompts that are automatically generated and/or customized based on the author's input.

[0246] Table 1 shows various default commands that may be used when generating navigator control code 506. In order to provide more customized table navigation in addition to the default commands in Table 1, an author may also enter other table-specific information that will aid in generating navigation control. For example, an author may specify a grammar pertaining to specific headings of content fields.

TABLE 1

Command        Action
First/Home     Navigate to first row. Play header field (or suitable prompt) of current position.
Last           Navigate to last row. Play header field (or suitable prompt) of current position.
Previous       Navigate to the previous row. If already on first row before issuing command, play the “Previous On First Error Message”, else play header field (or suitable prompt) of current position.
Next           Navigate to the next row. If already on last row before issuing command, play the “Next On Last Error Message”, else play header field (or suitable prompt) of current position.
Read           Play data in content fields (or other defined prompt). Play header field (or suitable prompt) of current position.
Header         Play header fields (or other prompt listing commands). Wait for next command.
Exit/Cancel    Terminate control execution.
Repeat         Repeat last prompt (whether it was a content or header prompt). Wait for next command.
Select         Associate current row or column with a value (i.e. a semantic item) and terminate execution.
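The default commands of Table 1 can be illustrated with a small state machine. This is a simplified sketch under several assumptions rather than the generated navigator control code: the TableNavigator class, the error message wording and the unrecognized-command reply are hypothetical; "Read" returns only the content text; and "Header" lists the header text of every row.

```python
from typing import List, Optional, Tuple


class TableNavigator:
    # Hypothetical wording for the two error messages named in Table 1.
    PREVIOUS_ON_FIRST = "You are already on the first row."
    NEXT_ON_LAST = "You are already on the last row."

    def __init__(self, rows: List[Tuple[str, str]]):
        # Each row is (header text, content text).
        if not rows:
            raise ValueError("table must have at least one row")
        self.rows = rows
        self.position = 0
        self.last_prompt = self._header()
        self.selected: Optional[str] = None
        self.done = False

    def _header(self) -> str:
        return self.rows[self.position][0]

    def handle(self, command: str) -> str:
        """Apply one command and return the prompt text to play."""
        command = command.lower()
        if command == "repeat":
            return self.last_prompt
        if command in ("exit", "cancel"):
            self.done = True
            return ""
        if command == "select":
            self.selected = self._header()  # value bound to a semantic item
            self.done = True
            return ""
        if command in ("first", "home"):
            self.position = 0
            prompt = self._header()
        elif command == "last":
            self.position = len(self.rows) - 1
            prompt = self._header()
        elif command == "previous":
            if self.position == 0:
                prompt = self.PREVIOUS_ON_FIRST
            else:
                self.position -= 1
                prompt = self._header()
        elif command == "next":
            if self.position == len(self.rows) - 1:
                prompt = self.NEXT_ON_LAST
            else:
                self.position += 1
                prompt = self._header()
        elif command == "read":
            prompt = self.rows[self.position][1]
        elif command == "header":
            prompt = ", ".join(header for header, _ in self.rows)
        else:
            prompt = "Sorry, I did not understand."
        self.last_prompt = prompt
        return prompt
```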

[0247] FIG. 15 illustrates various tasks that may be completed by an author 510 in order to generate navigator control code 506. Task 512 includes identifying a table source for a particular table. This table will be used in order to generate the navigator control code 506. Once the table has been identified, various other tasks may be performed by author 510 in order to customize the navigator control for the table. Task 514 includes identifying header and content fields for the table. The header fields identify information that is included in the content fields. For example, in one embodiment, the header fields can include a city and state and the content fields include weather conditions for the city and state combinations. Given the header and content fields, an author 510 may identify a header field grammar (task 518) and/or content field grammars (task 516). For example, a header field grammar may include city/state combinations and the content field grammar may include identifiers of particular weather information such as low temperature, high temperature and sky conditions. The grammars may also be generated automatically based on text in the headings for columns and rows identified during task 514. In one mode, the grammars provide recognition only for selected words from a larger grammar. Author 510 may also identify alternative choices (i.e. synonyms) to be recognized for the column and row headings. In the case where author 510 identifies a list of selectable choices, the control may be configured to update a value (i.e. a SemanticItem 412 in the Semantic Map 410) based on a user's selection.

[0248] Author 510 may also identify various prompts at task 520. These prompts may introduce data in a table or identify commands available for a user when rendering data in the table. Additionally, the prompts may include various contexts that are used when rendering data in the table. In order to generate the navigator control code, task 522 is performed by author 510, which instantiates the navigator control and binds the table to the navigator control. After task 522 is completed, a user may use the navigator control to navigate through a table and render information in the table.

[0249] FIG. 16 illustrates an exemplary table 530 for which code generator 504 may generate suitable navigator control code 506. Table 530 includes a plurality of columns 532 and a plurality of rows 534. Each row 534 includes a header field (or fields) 536 and content field 538 comprising one or more values in the columns 532. Header field 536 identifies the information contained in content field 538. For example, header field 536 may include a city and/or state while content fields 538 include weather information pertaining to the particular city and/or state. A number of commands may be generated in order to navigate through table 530. For example, a “next” command 540 will move a position within table 530 to the next row, while a “previous” command 542 will move the position in table 530 to the previous row. A “read” command 544 will read the content fields 538 for the particular row, and a “column” command 546 will render a specific column for a particular row. An “exit” command 548 exits out of the navigation controls.

[0250] FIG. 17 illustrates a flow diagram of an exemplary method used for navigating through a table implemented by a navigator control. Method 560 begins at step 562 wherein a data header field 536 is read. Alternatively, another prompt or table identification information may be read at this step. At step 564, a command is received from the user. A variety of different commands may be received in order to provide table navigation and render data to a user. At step 566, it is determined whether the user has entered a content command, which requests information within one or more of the content fields. For example, the user may ask to read an entire row or read a specific column within the row. If a content command is entered, the appropriate content is rendered at step 568 and the method 560 returns to step 564 to await an additional command from the user. Alternatively, for example if the entire row has been rendered, method 560 may return to step 562 wherein the position in the table is incremented and a next header field is read.

[0251] At step 570, it is determined whether a header command has been entered by a user. If a header command is received, method 560 proceeds to step 572 wherein a portion or all of the header fields are read. For example, the header fields may include a list of choices and all of the choices will be read to the user. After the header fields have been read, method 560 returns to step 564 to wait for another command. If a header command is not received, the method 560 determines at step 574 whether a navigation command is received. For example, a user may issue a command to update the position to the next row or the previous row. If a navigation command is received at step 574, a position in table 530 will be updated to the appropriate row at step 576. Method 560 then returns to step 562.

[0252] At step 578, it is determined whether an exit command has been received. Upon receipt of an exit command, the method ends at step 580. If the user input is not recognized, the method 560 proceeds to step 582, wherein the user can be notified that an error has occurred. After error step 582, method 560 returns to step 562. It is worth noting that method 560 is exemplary and other methods and/or commands may be utilized in accordance with the present invention. For example, user silence may be interpreted to move to a next position in the table. Additionally, the navigator control may be adaptable to receive input commands at any time, and need not wait for rendering of data or another action to complete before performing the action associated with the input command.

[0253] FIGS. 18-19 illustrate exemplary operation of a navigator control for table information that has been generated based on the author's input. With regard to FIG. 18, table 600 shows weather information and includes data header fields 602 for a city and state and data content fields 604 including a low temperature, a high temperature and sky conditions. To provide customized control, an author may provide a grammar that specifies cities, states, the low temperature, high temperature, and sky conditions. Alternatively, the grammars may be generated by the navigator control based on row and column headings.

[0254] An example dialogue 606 is illustrated wherein the control begins an interaction with a user by requesting a location for weather information. This request may be a default prompt or specified by an author. Once a user selects a location, the location is confirmed by the control. The user then requests the weather, and the default content that is read is the low temperature and the high temperature. A user may also request the sky conditions based on a current position in the table. Dialogue 606 also demonstrates using context to render data within table 600. Context refers to the rendering of data in addition to the data stored in the table. For example, table 600 only contains the data “clear” for the sky conditions in Spokane, Wash.; the context includes “The sky conditions are . . . ” to provide a more suitable presentation of data to the user. Other contexts can be developed by an author.
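The use of context can be illustrated by wrapping a stored value in a per-column template before it is rendered. The column keys and the templates other than the sky conditions wording are hypothetical, as is the fallback of rendering the bare value when no context is defined.

```python
from typing import Dict, Optional

# Hypothetical context templates keyed by column name.
CONTEXTS: Dict[str, str] = {
    "sky": "The sky conditions are {value}.",
    "low": "The low temperature is {value}.",
    "high": "The high temperature is {value}.",
}


def render(column: str, value: object,
           contexts: Optional[Dict[str, str]] = None) -> str:
    """Return the text to speak for one table cell."""
    templates = CONTEXTS if contexts is None else contexts
    template = templates.get(column)
    if template is None:
        return str(value)  # no context defined: render the stored data only
    return template.format(value=value)
```

With these templates, render("sky", "clear") yields the sentence form used in dialogue 606 rather than the bare word stored in table 600.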

[0255] FIG. 19 illustrates another example of table navigation and a dialogue between a user and a computer to render e-mail messages. Table 620 includes header field 622 and content fields 624. As shown in the dialogue section 626, a computer may begin by rendering some initial information. In this case, the control has indicated to the user that the user has five new messages. A user requests the first message, which defaults to reading the first header field. In this case, the header field is the subject of the message, which is rendered to the user. Next in the dialogue 626, the user requests that the message be read. Accordingly, the control responds by reading the message. The user then inquires as to who the sender of the message was and the control responds with the appropriate sender information from table 620. The user then issues a next command, which moves the position to the next message in the table. The control then renders the next subject (header field) in the table. The user then enters an exit command, which is interpreted to exit the control.

[0256] From the foregoing, a method and system are provided for generating mark-up for client side devices for speech-enabled applications. The same set of controls can be used in three different forms of interaction including Voice-only, Tap-and-talk (multi-modal) and Hands-free (multi-modal). In Voice-only, dialogs are provided on a GUI-less browser such as for telephony applications. This kind of application is driven by a dialog-flow manager that runs on the client (RunSpeech). In Tap-and-talk (multi-modal), dialogs contain a usable GUI without speech output. System prompts are generally not provided and the interaction is managed by the user's click events on the GUI. In Hands-free (multi-modal), dialogs use a GUI display and speech input and output. The dialog may be authored for Tap-and-talk, but may still use the RunSpeech algorithm, or other speech control features, to enable system-driven voice prompting, while confirmation is provided visually. Switching between multi-modal/hands-free and voice-only is done by detecting the type of client the controls are talking to. Generally, Hands-free is switched on on demand.

[0257] The controls provide an efficient, user-friendly mechanism to generate code that is useful in speech interaction applications. Ultimately, time and money are saved during application development.

[0258] Although the present invention has been described with reference to preferred embodiments, workers skilled in the art will recognize that changes may be made in form and detail without departing from the spirit and scope of the invention.

Appendix A

[0259] 1 Introduction

[0260] The following tags are a set of markup elements that allows a document to use speech as an input or output medium. The tags are designed to be self-contained XML that can be embedded into any SGML-derived markup language such as HTML, XHTML, cHTML, SMIL, WML and the like. The tags herein conform generally to Speech Application Language Tags (SALT). SALT is a developing standard for enabling access to information, applications and web services from personal computers, telephones, tablet PCs and wireless mobile devices, for example. The SALT 1.0 specification may be found online at http://www.SALTforum.org. The tags used herein are similar to SAPI 5.0, which are known methods available from Microsoft Corporation of Redmond, Wash. The tags, elements, events, attributes, properties, return values, etc. are merely exemplary and should not be considered limiting. Although exemplified herein for speech and DTMF recognition, similar tags can be provided for other forms of recognition.

[0261] The main elements herein discussed are: <prompt ...> for speechsynthesis configuration and prompt playing <reco ...> for recognizerconfiguration and recognition execution and post-processing <grammar...> for specifying input grammar resources <bind ...> for processing ofrecognition results <dtmf ...> for configuration and control of DTMF

[0262] 2 Reco

[0263] The Reco element is used to specify possible user inputs and a means for dealing with the input results. As such, its main elements are <grammar> and <bind>, and it contains resources for configuring recognizer properties.

[0264] Reco elements are activated programmatically in uplevel browsers via Start and Stop methods, or in SMIL-enabled browsers by using SMIL commands. They are considered active declaratively in downlevel browsers (i.e. non script-supporting browsers) by their presence on the page. In order to permit the activation of multiple grammars in parallel, multiple Reco elements may be considered active simultaneously.

[0265] Recos may also take a particular mode (‘automatic’, ‘single’ or ‘multiple’) to distinguish the kind of recognition scenarios which they enable and the behaviour of the recognition platform.

[0266] 2.1 Reco Content

[0267] The Reco element contains one or more grammars and optionally a set of bind elements which inspect the results of recognition and copy the relevant portions to values in the containing page.

[0268] In uplevel browsers, Reco supports the programmatic activation and deactivation of individual grammar rules. Note also that all top-level rules in a grammar are active by default for a recognition context.

[0269] 2.1.1 <grammar> Element

[0270] The grammar element is used to specify grammars, either inline or referenced using the src attribute. At least one grammar (either inline or referenced) is typically specified. Inline grammars can be text-based grammar formats, while referenced grammars can be text-based or binary type. Multiple grammar elements may be specified. If more than one grammar element is specified, the rules within grammars are added as extra rules within the same grammar. Any rules with the same name will be overwritten.

[0271] Attributes:

[0272] src: Optional if inline grammar is specified. URI of the grammar to be included. Note that all top-level rules in a grammar are active by default for a recognition context.

[0273] langID: Optional. String indicating which language speech engine should use. The string format follows the xml:lang definition. For example, langID=“en-us” denotes US English. This attribute is only effective when the langID is not specified in the grammar URI. If unspecified, defaults to US English.

[0274] If the langID is specified in multiple places then langID follows a precedence order from the lowest scope: remote grammar file (i.e. language id is specified within the grammar file) followed by grammar element followed by reco element.

    <grammar src="FromCity.xml" />

or

    <grammar>
      <rule toplevel="active">
        <p>from </p>
        <ruleref name="cities" />
      </rule>
      <rule name="cities">
        <l>
          <p> Cambridge </p>
          <p> Seattle </p>
          <p> London </p>
        </l>
      </rule>
    </grammar>
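The langID precedence order described above can be illustrated with a minimal Python sketch. The function name resolve_lang_id and its parameters are hypothetical and are not part of the tags described herein; the sketch only assumes that each scope either supplies a langID string or supplies nothing, and that US English is the fallback as stated for the grammar element.

```python
from typing import Optional

DEFAULT_LANG_ID = "en-us"  # fallback stated for the grammar element: US English


def resolve_lang_id(grammar_file: Optional[str] = None,
                    grammar_element: Optional[str] = None,
                    reco_element: Optional[str] = None) -> str:
    """Return the effective langID, preferring the lowest scope.

    Hypothetical helper: the lowest scope (the remote grammar file) wins,
    then the grammar element, then the reco element.
    """
    for candidate in (grammar_file, grammar_element, reco_element):
        if candidate:
            return candidate
    return DEFAULT_LANG_ID
```

For instance, a langID given inside the grammar file would be returned even when the reco element also specifies one.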

[0275] If both a src-referenced grammar and an inline grammar are specified, the inline rules are added to the referenced rules, and any rules with the same name will be overwritten.

[0276] 2.1.2 <bind> Element

[0277] The bind element is used to bind values from the recognition results into the page.

[0278] The recognition results consumed by the bind element can be an XML document containing a semantic markup language (SML) for specifying recognition results. Its contents include semantic values, actual words spoken, and confidence scores. SML could also include alternate recognition choices (as in an N-best recognition result). A sample SML document for the utterance “I'd like to travel from Seattle to Boston” is illustrated below:

    <sml confidence="40">
      <travel text="I'd like to travel from Seattle to Boston">
        <origin_city confidence="45"> Seattle </origin_city>
        <dest_city confidence="35"> Boston </dest_city>
      </travel>
    </sml>

[0279] Since an in-grammar recognition is assumed to produce an XML document (in semantic markup language, or SML), the values to be bound from the SML document are referenced using an XPath query. And since the elements in the page into which the values will be bound should be uniquely identified (they are likely to be form controls), these target elements are referenced directly.

[0280] Attributes:

[0281] targetElement: Required. The element to which the value content from the SML will be assigned (as in W3C SMIL 2.0).

[0282] targetAttribute: Optional. The attribute of the target element to which the value content from the SML will be assigned (as with the attributeName attribute in SMIL 2.0). If unspecified, defaults to “value”.

[0283] test: Optional. An XML Pattern (as in the W3C XML DOM specification) string indicating the condition under which the recognition result will be assigned. Default condition is true.

[0284] value: Required. An XPATH (as in the W3C XML DOM specification) string that specifies the value from the recognition result document to be assigned to the target element.

EXAMPLE

[0285] So given the above SML return, the following reco element uses bind to transfer the values in origin_city and dest_city into the target page elements txtBoxOrigin and txtBoxDest:

    <input name="txtBoxOrigin" type="text" />
    <input name="txtBoxDest" type="text" />
    <reco id="travel">
      <grammar src="./city.xml" />
      <bind targetElement="txtBoxOrigin" value="//origin_city" />
      <bind targetElement="txtBoxDest" value="//dest_city" />
    </reco>

[0286] This binding may be conditional, as in the following example, where a test is made on the confidence attribute of the dest_city result as a pre-condition to the bind operation:

    <bind targetElement="txtBoxDest" value="//dest_city" test="/sml/dest_city[@confidence $gt$ 40]" />
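The effect of unconditional and conditional bind elements on the sample SML document can be approximated outside a browser. The following Python sketch is illustrative only: apply_binds is a hypothetical helper, a dictionary stands in for the form controls of the page, and the confidence test is written by hand because the standard library ElementTree module supports only a limited XPath subset without numeric comparisons.

```python
import xml.etree.ElementTree as ET

SML = """<sml confidence="40">
  <travel text="I'd like to travel from Seattle to Boston">
    <origin_city confidence="45"> Seattle </origin_city>
    <dest_city confidence="35"> Boston </dest_city>
  </travel>
</sml>"""


def apply_binds(sml_text, binds, page):
    """Copy values from an SML result into `page` (a dict standing in for form fields).

    Each bind is (target_element, element_name, min_confidence). A
    min_confidence of None makes the bind unconditional; otherwise the
    value is copied only when the confidence is greater than it.
    """
    root = ET.fromstring(sml_text)
    for target, name, min_confidence in binds:
        node = root.find(f".//{name}")  # rough stand-in for value="//name"
        if node is None:
            continue
        if min_confidence is not None and int(node.get("confidence", "0")) <= min_confidence:
            continue  # test failed, so the target field is left untouched
        page[target] = (node.text or "").strip()
    return page


page = apply_binds(SML, [("txtBoxOrigin", "origin_city", None),
                         ("txtBoxDest", "dest_city", 40)], {})
```

With the sample document, only txtBoxOrigin is filled, since the dest_city confidence of 35 does not exceed 40.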

[0287] The bind element is a simple declarative means of processing recognition results on downlevel or uplevel browsers. For more complex processing, the reco DOM object supported by uplevel browsers implements the onReco event handler to permit programmatic script analysis and post-processing of the recognition return.

[0288] 2.2 Attributes and Properties

[0289] The following attributes are supported by all browsers, and the properties by uplevel browsers.

[0290] 2.2.1 Attributes

[0291] The following attributes of Reco are used to configure the speech recognizer for a dialog turn.

[0292] initialTimeout: Optional. The time in milliseconds between start of recognition and the detection of speech. This value is passed to the recognition platform, and if exceeded, an onSilence event will be provided from the recognition platform (see 2.4.2). If not specified, the speech platform will use a default value.

[0293] babbleTimeout: Optional. The period of time in milliseconds in which the recognizer must return a result after detection of speech. For recos in automatic and single mode, this applies to the period between speech detection and the stop call. For recos in ‘multiple’ mode, this timeout applies to the period between speech detection and each recognition return, i.e. the period is restarted after each return of results or other event. If exceeded, different events are thrown according to whether an error has occurred or not. If the recognizer is still processing audio, e.g. in the case of an exceptionally long utterance, the onNoReco event is thrown, with status code −15 (see 2.4.4). If the timeout is exceeded for any other reason, however, a recognizer error is more likely, and the onTimeout event is thrown. If not specified, the speech platform will default to an internal value.

[0294] maxTimeout: Optional. The period of time in milliseconds between recognition start and results returned to the browser. If exceeded, the onTimeout event is thrown by the browser; this caters for network or recognizer failure in distributed environments. For recos in ‘multiple’ mode, as with babbleTimeout, the period is restarted after the return of each recognition or other event. Note that the maxTimeout attribute should be greater than or equal to the sum of initialTimeout and babbleTimeout. If not specified, the value will be a browser default.
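The relationship among the three timeout attributes can be checked before a page is served. The following Python sketch is a hypothetical authoring-time check, not a behaviour of any browser or platform described herein; the function name and the rejection of negative values are assumptions.

```python
def check_reco_timeouts(initial_timeout, babble_timeout, max_timeout):
    """Return True when maxTimeout >= initialTimeout + babbleTimeout.

    All values are in milliseconds. Negative values are rejected as an
    assumption of this sketch; the text above does not address them.
    """
    for value in (initial_timeout, babble_timeout, max_timeout):
        if value < 0:
            raise ValueError("timeouts must be non-negative milliseconds")
    return max_timeout >= initial_timeout + babble_timeout
```

For example, an initialTimeout of 3000 and a babbleTimeout of 10000 would call for a maxTimeout of at least 13000.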

[0295] endSilence: Optional. For Recos in automatic mode, the period of silence in milliseconds that must follow the end of an utterance, free of speech, before the recognition results are returned. Ignored for recos of modes other than automatic. If unspecified, defaults to platform internal value.

[0296] reject: Optional. The recognition rejection threshold, below which the platform will throw the ‘no reco’ event. If not specified, the speech platform will use a default value. Confidence scores range between 0 and 100 (integer). Reject values lie in between.

[0297] server: Optional. URI of speech platform (for use when the tag interpreter and recognition platform are not co-located). An example value might be server=protocol://yourspeechplatform. An application writer is also able to provide speech platform specific settings by adding a querystring to the URI string, e.g. protocol://yourspeechplatform?bargeinEnergyThreshold=0.5.

[0298] langID: Optional. String indicating which language speech engine should use. The string format follows the xml:lang definition. For example, langID=“en-us” denotes US English. This attribute is only effective when the langID is not specified in the grammar element (see 2.1.1).

[0299] mode: Optional. String specifying the recognition mode to be followed. If unspecified, defaults to “automatic” mode.

[0300] 2.2.2 Properties

[0301] The following properties contain the results returned by the recognition process (these are supported by uplevel browsers).

[0302] recoResult: Read-only. The results of recognition, held in an XML DOM node object containing semantic markup language (SML), as described in 2.1.2. In case of no recognition, the property returns null.

[0303] text: Read-only. A string holding the text of the words recognized (i.e., a shorthand for contents of the text attribute of the highest level element in the SML recognition return in recoResult).

[0304] status: Read-only. Status code returned by the recognition platform. Possible values are 0 for successful recognition, or the failure values −1 to −4 (as defined in the exceptions possible on the Start method (section 2.3.1) and Activate method (section 2.3.4)), and statuses −11 to −15 set on the reception of recognizer events (see 2.4).

[0305] 2.3 Object Methods

[0306] Reco activation and grammar activation may be controlled using the following methods in the Reco's DOM object. With these methods, uplevel browsers can start and stop Reco objects, cancel recognitions in progress, and activate and deactivate individual grammar top-level rules (uplevel browsers only).

[0307] 2.3.1 Start

[0308] The Start method starts the recognition process, using as active grammars all the top-level rules for the recognition context which have not been explicitly deactivated.

[0309] Syntax:

[0310] Object.Start( )

[0311] Return value:

[0312] None.

[0313] Exception:

[0314] The method sets a non-zero status code and fires an onNoReco event when it fails. Possible failures include no grammar (reco status=−1), failure to load a grammar, which could be for a variety of reasons such as failure to compile the grammar or a non-existent URI (reco status=−2), or speech platform errors (reco status=−3).

[0315] 2.3.2 Stop

[0316] The Stop method is a call to end the recognition process. The Reco object stops recording audio, and the recognizer returns recognition results on the audio received up to the point where recording was stopped. All the recognition resources used by Reco are released, and its grammars deactivated. (Note that this method need not be used explicitly for typical recognitions in automatic mode, since the recognizer itself will stop the reco object on endpoint detection after recognizing a complete sentence.) If the Reco has not been started, the call has no effect.

[0317] Syntax:

[0318] Object.Stop( )

[0319] Return value:

[0320] None.

[0321] Exception:

[0322] None.

[0323] 2.3.3 Cancel

[0324] The Cancel method stops the audio feed to the recognizer, deactivates the grammar, releases the recognizer and discards any recognition results. The browser will disregard a recognition result for a canceled recognition. If the recognizer has not been started, the call has no effect.

[0325] Syntax:

[0326] Object.Cancel( )

[0327] Return value:

[0328] None.

[0329] Exception:

[0330] None.

[0331] 2.3.4 Activate

[0332] The Activate method activates a top-level rule in the context free grammar (CFG). Activation must be called before recognition begins, since it will have no effect during a ‘Started’ recognition process. Note that all the grammar top-level rules for the recognition context which have not been explicitly deactivated are already treated as active.

[0333] Syntax:

[0334] Object.Activate(strName);

[0335] Parameters:

[0336] strName: Required. Rule name to be activated.

[0337] Return value:

[0338] None.

[0339] Exception:

[0340] None.

[0341] 2.3.5 Deactivate

[0342] The method deactivates a top-level rule in the grammar. If the rule does not exist, the method has no effect.

[0343] Syntax:

[0344] Object.Deactivate(strName);

[0345] Parameters:

[0346] strName: Required. Rule name to be deactivated. An empty string deactivates all rules.

[0347] Return value:

[0348] None.

[0349] Exception:

[0350] None.
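The activation rules stated for Activate and Deactivate can be modelled with a small state holder. The Python sketch below is a toy model under stated assumptions: the class RuleSet and its attribute names are hypothetical, and ignoring an Activate call for an unknown rule name is an assumption, since the text only specifies that behaviour for Deactivate.

```python
class RuleSet:
    """Toy model of the top-level rule activation state of one Reco object."""

    def __init__(self, rule_names):
        self.rules = set(rule_names)
        self.active = set(rule_names)  # every top-level rule starts active
        self.started = False           # True while a recognition is running

    def activate(self, name):
        # Activation has no effect once recognition has started.
        # Assumption: unknown rule names are ignored.
        if not self.started and name in self.rules:
            self.active.add(name)

    def deactivate(self, name):
        if name == "":
            self.active.clear()        # an empty string deactivates all rules
        else:
            self.active.discard(name)  # a non-existent rule has no effect
```

A typical use would deactivate all rules with an empty string and then activate only the rule needed for the next dialog turn before calling Start.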

[0351] 2.4 Reco Events

[0352] The Reco DOM object supports the following events, whose handlers may be specified as attributes of the reco element.

[0353] 2.4.1 onReco:

[0354] This event gets fired when the recognizer has a recognition result available for the browser. For recos in automatic mode, this event stops the recognition process automatically and clears resources (see 2.3.2). OnReco is typically used for programmatic analysis of the recognition result and processing of the result into the page.

[0355] Syntax:

    Inline HTML:     <reco onReco="handler">
    Event property:  Object.onReco = handler;
                     Object.onReco = GetRef("handler");

[0356] Event Object Info:

    Bubbles:         No
    To invoke:       User says something
    Default action:  Return recognition result object

[0357] Event Properties:

[0358] Although the event handler does not receive properties directly, the handler can query the event object for data (see the use of the event object in the example below).

EXAMPLE

[0359] The following XHTML fragment uses onReco to call a script to parse the recognition outcome and assign the values to the proper fields.

    <input name="txtBoxOrigin" type="text" />
    <input name="txtBoxDest" type="text" />
    <reco onReco="processCityRecognition()">
      <grammar src="/grammars/cities.xml" />
    </reco>
    <script><![CDATA[
      function processCityRecognition() {
        // The event object identifies the reco element that fired.
        smlResult = event.srcElement.recoResult;
        origNode = smlResult.selectSingleNode("//origin_city");
        if (origNode != null) txtBoxOrigin.value = origNode.text;
        destNode = smlResult.selectSingleNode("//dest_city");
        if (destNode != null) txtBoxDest.value = destNode.text;
      }
    ]]></script>

[0360] 2.4.2 onSilence:

[0361] onSilence handles the event of no speech detected by the recognition platform before the duration of time specified in the initialTimeout attribute on the Reco (see 2.2.1). This event cancels the recognition process automatically for the automatic recognition mode.

[0362] Syntax:

    Inline HTML:                     <reco onSilence="handler" ...>
    Event property (in ECMAScript):  Object.onSilence = handler
                                     Object.onSilence = GetRef("handler");

[0363] Event Object Info:

    Bubbles:         No
    To invoke:       Recognizer did not detect speech within the period specified in the initialTimeout attribute.
    Default action:  Set status = −11

[0364] Event Properties:

[0365] Although the event handler does not receive properties directly, the handler can query the event object for data.

[0366] 2.4.3 onTimeout

[0367] onTimeout handles two types of event which typically reflect errors from the speech platform.

[0368] It handles (i) the event thrown by the tags interpreter which signals that the period specified in the maxTimeout attribute (see 2.2.1) expired before recognition was completed. This event will typically reflect problems that could occur in a distributed architecture.

[0369] It also handles (ii) the event thrown by the speech recognition platform when recognition has begun but processing has stopped without a recognition within the period specified by babbleTimeout (see 2.2.1).

[0370] This event cancels the recognition process automatically.

[0371] Syntax:

    Inline HTML:                     <reco onTimeout="handler" ...>
    Event property (in ECMAScript):  Object.onTimeout = handler
                                     Object.onTimeout = GetRef("handler");

[0372] Event Object Info:

    Bubbles:         No
    To invoke:       Thrown by the browser when the period set by the maxTimeout attribute expires before recognition is stopped.
    Default action:  Set reco status to −12.

[0373] Event Properties:

[0374] Although the event handler does not receive properties directly, the handler can query the event object for data.

[0375] 2.4.4 onNoReco:

onNoReco is a handler for the event thrown by the speech recognition platform when it is unable to return valid recognition results. The different cases in which this may happen are distinguished by status code. The event stops the recognition process automatically.

[0376] Syntax:

    Inline HTML:     <reco onNoReco="handler">
    Event property:  Object.onNoReco = handler;
                     Object.onNoReco = GetRef("handler");

[0377] Event Object Info:

    Bubbles:         No
    To invoke:       Recognizer detects sound but is unable to interpret the utterance.
    Default action:  Set status property and return null recognition result. Status codes are set as follows:
        status −13: sound was detected but no speech was able to be interpreted;
        status −14: some speech was detected and interpreted but rejected with insufficient confidence (for threshold setting, see the reject attribute in 2.2.1);
        status −15: speech was detected and interpreted, but a complete recognition was unable to be returned between the detection of speech and the duration specified in the babbleTimeout attribute (see 2.2.1).
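The status values described in sections 2.2.2, 2.3.1 and 2.4 can be collected into a single lookup for use in diagnostic output. The Python sketch below is illustrative: the names RECO_STATUS, describe_status and is_rejected are hypothetical, the descriptions paraphrase the text above, and the value −4 is omitted because this appendix does not describe it.

```python
RECO_STATUS = {
    0: "successful recognition",
    -1: "Start failed: no grammar",
    -2: "Start failed: failure to load a grammar",
    -3: "Start failed: speech platform error",
    -11: "onSilence: no speech detected within initialTimeout",
    -12: "onTimeout: recognition not completed in time",
    -13: "onNoReco: sound detected but no speech interpreted",
    -14: "onNoReco: speech rejected with insufficient confidence",
    -15: "onNoReco: no complete recognition within babbleTimeout",
}


def describe_status(code):
    """Return a short description of a reco status code."""
    return RECO_STATUS.get(code, "status not described in this appendix")


def is_rejected(confidence, reject):
    """Model the reject attribute: scores below the threshold are rejected.

    Both values are integers in the range 0 to 100.
    """
    return confidence < reject
```

With a reject threshold of 40, a result with confidence 35 would be rejected and would correspond to status −14.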

[0378] Event Properties:

[0379] Although the event handler does not receive properties directly, the handler can query the event object for data.

[0380] 3 Prompt

[0381] The prompt element is used to specify system output. Its content may be one or more of the following:

[0382] inline or referenced text, which may be marked up with prosodic or other speech output information;

[0383] variable values retrieved at render time from the containing document;

[0384] links to audio files.

[0385] Prompt elements may be interpreted declaratively by downlevel browsers (or activated by SMIL commands), or by object methods on uplevel browsers.

[0386] 3.1 Prompt Content

[0387] The prompt element contains the resources for system output, either as text or references to audio files, or both.

[0388] Simple prompts need specify only the text required for output, e.g.:

    <prompt id="Welcome">
      Thank you for calling ACME weather report.
    </prompt>

[0389] This simple text may also contain further markup of any of the kinds described below.

[0390] 3.1.1 Speech Synthesis Markup

[0391] Any format of speech synthesis markup language can be used inside the prompt element. (This format may be specified in the ‘tts’ attribute described in 3.2.1.)

[0392] The following example shows text with an instruction to emphasize certain words within it:

    <prompt id="giveBalance">
      You have <emph> five dollars </emph> left in your account.
    </prompt>

[0393] 3.1.2 Dynamic Content

[0394] The actual content of the prompt may need to be computed on the client just before the prompt is output. In order to confirm a particular value, for example, the value needs to be dereferenced in a variable. The value element may be used for this purpose.

[0395] Value Element

[0396] value: Optional. Retrieves the values of an element in the document.

[0397] Attributes:

[0398] targetElement: Optional. Either href or targetElement must be specified. The id of the element containing the value to be retrieved.

[0399] targetAttribute: Optional. The attribute of the element from which the value will be retrieved.

[0400] href: Optional. The URI of an audio segment. href will override targetElement if both are present.

[0401] The targetElement attribute is used to reference an element within the containing document. The content of the element whose id is specified by targetElement is inserted into the text to be synthesized. If the desired content is held in an attribute of the element, the targetAttribute attribute may be used to specify the necessary attribute on the targetElement. This is useful for dereferencing the values in HTML form controls, for example. In the following illustration, the “value” attributes of the “txtBoxOrigin” and “txtBoxDest” elements are inserted into the text before the prompt is output:

    <prompt id="Confirm">
      Do you want to travel from
      <value targetElement="txtBoxOrigin" targetAttribute="value" />
      to
      <value targetElement="txtBoxDest" targetAttribute="value" />
      ?
    </prompt>
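The dereferencing performed by the value element can be sketched in Python as a text substitution over the prompt markup. This is an approximation under stated assumptions: render_prompt is a hypothetical helper, a nested dictionary stands in for the containing document, the key "#content" is an invented convention for an element's own content when no targetAttribute is given, and the href attribute for audio segments is not handled.

```python
import xml.etree.ElementTree as ET


def render_prompt(prompt_xml, page):
    """Return the text to synthesize, replacing each <value> with page content.

    `page` maps an element id to a dict of its attribute values; it is a
    hypothetical stand-in for the containing document.
    """
    prompt = ET.fromstring(prompt_xml)
    parts = [prompt.text or ""]
    for child in prompt:
        if child.tag == "value":
            element = page.get(child.get("targetElement"), {})
            # Assumption: "#content" holds the element's own content when
            # no targetAttribute is specified.
            attribute = child.get("targetAttribute", "#content")
            parts.append(element.get(attribute, ""))
        parts.append(child.tail or "")
    return " ".join("".join(parts).split())  # collapse runs of whitespace


CONFIRM = ('<prompt id="Confirm">Do you want to travel from '
           '<value targetElement="txtBoxOrigin" targetAttribute="value" /> to '
           '<value targetElement="txtBoxDest" targetAttribute="value" /> ?</prompt>')
PAGE = {"txtBoxOrigin": {"value": "Seattle"}, "txtBoxDest": {"value": "Boston"}}
rendered = render_prompt(CONFIRM, PAGE)
```

Here the rendered string is the text that would be handed to the synthesizer once both form fields have been filled.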

[0402] 3.1.3 Audio Files

[0403] The value element may also be used to refer to a pre-recorded audio file for playing instead of, or within, a synthesized prompt. The following example plays a beep at the end of the prompt:

    <prompt>
      After the beep, please record your message.
      <value href="/wav/beep.wav" />
    </prompt>

[0404] 3.1.4 Referenced Prompts

[0405] Instead of specifying content inline, the src attribute may be used with an empty element to reference external content via URI, as in:

    <prompt id="Welcome" src="/ACMEWeatherPrompts#Welcome" />

[0406] The target of the src attribute can hold any or all of the above content specified for inline prompts.

[0407] 3.2 Attributes and Properties

[0408] The prompt element holds the following attributes (downlevel browsers) and properties (downlevel and uplevel browsers).

[0409] 3.2.1 Attributes

[0410] tts: Optional. The markup language type for text-to-speech synthesis. Default is “SAPI 5”.

[0411] src: Optional if an inline prompt is specified. The URI of a referenced prompt (see 3.1.4).

[0412] bargein: Optional. Integer. The period of time in milliseconds from start of prompt to when playback can be interrupted by the human listener. Default is infinite, i.e., no bargein is allowed. Bargein=0 allows immediate bargein. This applies to whichever kind of barge-in is supported by platform. Either keyword or energy-based bargein times can be configured in this way, depending on which is enabled at the time the reco is started.

[0413] prefetch: Optional. A Boolean flag indicating whether the prompt should be immediately synthesized and cached at browser when the page is loaded. Default is false.

[0414] 3.2.2 Properties

[0415] Uplevel browsers support the following properties in the prompt's DOM object.

[0416] bookmark: Read-only. A string object recording the text of the last synthesis bookmark encountered.

[0417] status: Read-only. Status code returned by the speech platform.

[0418] 3.3 Prompt Methods

[0419] Prompt playing may be controlled using the following methods in the prompt's DOM object. In this way, uplevel browsers can start and stop prompt objects, pause and resume prompts in progress, and change the speed and volume of the synthesized speech.

[0420] 3.3.1 Start

[0421] Start playback of the prompt. Unless an argument is given, the method plays the contents of the object. Only a single prompt object is considered ‘started’ at a given time, so if Start is called in succession, all playbacks are played in sequence.

[0422] Syntax:

[0423] Object.Start([strText]);

[0424] Parameters:

[0425] strText: the text to be sent to the synthesizer. If present, this argument overrides the contents of the object.

[0426] Return value:

[0427] None.

[0428] Exception:

[0429] Sets status=−1 and fires an onComplete event if the audio buffer is already released by the server.

[0430] 3.3.2 Pause

[0431] Pause playback without flushing the audio buffer. This method has no effect if playback is paused or stopped.

[0432] Syntax:

[0433] Object.Pause( );

[0434] Return value:

[0435] None.

[0436] Exception:

[0437] None.

[0438] 3.3.3 Resume

[0439] Resume playback without flushing the audio buffer. This method has no effect if playback has not been paused.

[0440] Syntax:

[0441] Object.Resume( );

[0442] Return value:

[0443] None.

[0444] Exception:

[0445] Throws an exception when resume fails.

[0446] 3.3.4 Stop

[0447] Stop playback, if not already stopped, and flush the audio buffer. If the playback has already been stopped, the method simply flushes the audio buffer.

[0448] Syntax:

[0449] Object.Stop( );

[0450] Return value:

[0451] None.

[0452] Exception:

[0453] None.

[0454] 3.3.5 Change

[0455] Change speed and/or volume of playback. Change may be called during playback.

[0456] Syntax:

[0457] Object.Change(speed, volume);

[0458] Parameters:

[0459] speed: Required. The factor to change.

[0460] Speed=2.0 means double the current rate,

[0461] speed=0.5 means halve the current rate,

[0462] speed=0 means to restore the default value.

[0463] volume: Required. The factor to change.

[0464] Volume=2.0 means double the current volume,

[0465] volume=0.5 means halve the current volume,

[0466] volume=0 means to restore the default value.

[0467] Return value:

[0468] None.

[0469] Exception:

[0470] None.
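The factor semantics of the Change method can be illustrated with a short Python model. The class PlaybackSettings and its default values of 1.0 are hypothetical; the sketch only assumes that each factor multiplies the current setting and that a factor of 0 restores the default, as the parameter descriptions above state.

```python
class PlaybackSettings:
    """Toy model of the speed and volume state changed by Change(speed, volume)."""

    def __init__(self, default_speed=1.0, default_volume=1.0):
        self.default_speed = default_speed
        self.default_volume = default_volume
        self.speed = default_speed
        self.volume = default_volume

    def change(self, speed, volume):
        """Scale the current rate and volume; a factor of 0 restores the default."""
        self.speed = self.default_speed if speed == 0 else self.speed * speed
        self.volume = self.default_volume if volume == 0 else self.volume * volume
```

Because the factors are relative to the current value, halving the volume and later doubling it returns playback to the original level, which is how the prompt control example in 3.3.6 restores the volume.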

[0471] 3.3.6 Prompt Control Example

[0472] The following example shows how control of the prompt using the methods above might be authored for a platform which does not support a keyword barge-in mechanism.

    <html>
    <head>
      <title>Prompt control</title>
      <script>
      <!--
        function checkKWBargein() {
          news.Change(1.0, 0.5);    // turn down the volume while verifying
          if (keyword.text == "") { // result is below threshold
            news.Change(1.0, 2.0);  // restore the volume
            keyword.Start();        // restart the recognition
          } else {
            news.Stop();            // keyword detected! Stop the prompt
            // Do whatever is necessary
          }
        }
      // -->
      </script>
      <script for="window" event="onload">
      <!--
        news.Start();
        keyword.Start();
      // -->
      </script>
    </head>
    <body>
      <prompt id="news" bargein="0">

[0473] Stocks turned in another lackluster performance Wednesday as investors received little incentive to make any big moves ahead of next week's Federal Reserve meeting. The tech-heavy Nasdaq Composite Index dropped 42.51 points to close at 2156.26. The Dow Jones Industrial Average fell 17.05 points to 10866.46 after an early-afternoon rally failed.

      </prompt>
      <reco id="keyword" reject="70" onReco="checkKWBargein()">
        <grammar src="http://denali/news bargeingrammar.xml" />
      </reco>
    </body>
    </html>

[0474] 3.4 Prompt Events

[0475] The prompt DOM object supports the following events, whose handlers may be specified as attributes of the prompt element.

[0476] 3.4.1 onBookmark

[0477] Fires when a synthesis bookmark is encountered.

[0478] The event does not pause the playback.

[0479] Syntax:

    Inline HTML:     <prompt onBookmark="handler" ...>
    Event property:  Object.onBookmark = handler
                     Object.onBookmark = GetRef("handler");

[0480] Event Object Info:

    Bubbles:         No
    To invoke:       A bookmark in the rendered string is encountered
    Default action:  Returns the bookmark string

[0481] Event Properties:

[0482] Although the event handler does not receive properties directly, the handler can query the event object for data.

[0483] 3.4.2 onBargein:

[0484] Fires when a user's barge-in event is detected. (Note that determining what constitutes a barge-in event, e.g. energy detection or keyword recognition, is up to the platform.) A specification of this event handler does not automatically turn the barge-in on.

[0485] Syntax:

    Inline HTML:     <prompt onBargein="handler" ...>
    Event property:  Object.onBargein = handler
                     Object.onBargein = GetRef("handler");

[0486] Event Object Info:

    Bubbles:         No
    To invoke:       A bargein event is encountered
    Default action:  None

[0487] Event Properties:

[0488] Although the event handler does not receive properties directly, the handler can query the event object for data.

[0489] 3.4.3 onComplete:

[0490] Fires when the prompt playback reaches the end or exceptions (as defined above) are encountered.

[0491] Syntax:

    Inline HTML:     <prompt onComplete="handler" ...>
    Event property:  Object.onComplete = handler
                     Object.onComplete = GetRef("handler");

[0492] Event Object Info:

    Bubbles:         No
    To invoke:       A prompt playback completes
    Default action:  Set status = 0 if playback completes normally, otherwise set status as specified above.

[0493] Event Properties:

[0494] Although the event handler does not receive properties directly, the handler can query the event object for data.

[0495] 3.4.4 Using Bookmarks and Events

[0496] The following example shows how bookmark events can be used to determine the semantics of a user response (either a correction to a departure city or the provision of a destination city) in terms of when bargein happened during the prompt output. The onBargein handler calls a script which sets a global ‘mark’ variable to the last bookmark encountered in the prompt, and the value of this ‘mark’ is used in the reco's postprocessing function (‘ProcessCityConfirm’) to set the correct value.

    <script><![CDATA[
      var mark;
      function interrupt() {
        mark = event.srcElement.bookmark;
      }
      function ProcessCityConfirm() {
        confirm.Stop(); // flush the audio buffer
        if (mark == "mark_origin_city")
          txtBoxOrigin.value = event.srcElement.text;
        else
          txtBoxDest.value = event.srcElement.text;
      }
    ]]></script>
    <body>
      <input name="txtBoxOrigin" value="Seattle" type="text" />
      <input name="txtBoxDest" type="text" />
      ...
      <prompt id="confirm" onBargein="interrupt()" bargein="0">
        From <bookmark mark="mark_origin_city" />
        <value targetElement="txtBoxOrigin" targetAttribute="value" />,
        please say <bookmark mark="mark_dest_city" /> the destination city you want to travel to.
      </prompt>
      <reco onReco="ProcessCityConfirm()">
        <grammar src="/grm/1033/cities.xml" />
      </reco>
      ...
    </body>

[0497] 4 DTMF

[0498] Creates a DTMF recognition object. The object can be instantiated using inline markup language syntax or in scripting. When activated, DTMF can cause a prompt object to fire a barge-in event. It should be noted that the tags and eventing discussed below with respect to DTMF recognition, and the call control discussed in Section 5, generally pertain to interaction between the voice browser 216 and media server 214.

[0499] 4.1 Content

[0500] dtmfgrammar: for inline grammar.

[0501] bind: assign DTMF conversion result to proper field.

[0502] Attributes:

[0503] targetElement: Required. The element to which a partial recognition result will be assigned (same as in W3C SMIL 2.0).

[0504] targetAttribute: the attribute of the target element to which the recognition result will be assigned (same as in SMIL 2.0). Default is “value”.

[0505] test: condition for the assignment. Default is true.

Example 1 Map Keys to Text

[0506]

    <input type="text" name="city" />
    <DTMF id="city_choice" timeout="2000" numDigits="1">
      <dtmfgrammar>
        <key value="1">Seattle</key>
        <key value="2">Boston</key>
      </dtmfgrammar>
      <bind targetElement="city" targetAttribute="value" />
    </DTMF>

When “city_choice” is activated, “Seattle” will be assigned to the input field if the user presses 1, “Boston” if 2, nothing otherwise.
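The key to text conversion performed by a DTMF grammar such as the one in Example 1 can be sketched as a simple table lookup. In the Python sketch below, convert_keys and CITY_GRAMMAR are hypothetical names, and a dictionary stands in for the dtmfgrammar element.

```python
CITY_GRAMMAR = {"1": "Seattle", "2": "Boston"}  # mirrors the <key> entries above


def convert_keys(keys, grammar):
    """Map each pressed key through the DTMF grammar.

    Keys without an entry in the grammar produce nothing, matching the
    "nothing otherwise" behaviour described for Example 1.
    """
    return [grammar[key] for key in keys if key in grammar]
```

Pressing 2 would therefore yield the single token Boston, which the bind element then assigns to the city field.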

Example 2 How DTMF can be used with Multiple Fields

[0507]

    <input type="text" name="area_code" />
    <input type="text" name="phone_number" />
    <DTMF id="areacode" numDigits="3" onReco="extension.Activate()">
      <bind targetElement="area_code" />
    </DTMF>
    <DTMF id="extension" numDigits="7">
      <bind targetElement="phone_number" />
    </DTMF>

[0508] This example demonstrates how to let users enter input into multiple fields.

Example 3 How to Allow Both Speech and DTMF Inputs and Disable Speech When User Starts DTMF

[0509]

    <input type="text" name="credit_card_number" />
    <prompt onBookmark="dtmf.Start(); speech.Start()" bargein="0">
      Please say <bookmark name="starting" /> or enter your credit card number now
    </prompt>
    <DTMF id="dtmf" escape="#" length="16" interdigitTimeout="2000" onkeypress="speech.Stop()">
      <bind targetElement="credit_card_number" />
    </DTMF>
    <reco id="speech">
      <grammar src="/grm/1033/digits.xml" />
      <bind targetElement="credit_card_number" />
    </reco>

[0510] 4.2 Attributes and Properties

[0511] 4.2.1 Attributes

[0512] dtmfgrammar: Required. The URI of a DTMF grammar.

[0513] 4.2.2 Properties

[0514] DTMFgrammar Read-Write.

[0515] An XML DOM Node object representing the DTMF-to-string conversion matrix (also called the DTMF grammar). The default grammar is

<dtmfgrammar>
  <key value="0">0</key>
  <key value="1">1</key>
  ...
  <key value="9">9</key>
  <key value="*">*</key>
  <key value="#">#</key>
</dtmfgrammar>

[0516] flush

[0517] Read-Write. A Boolean flag indicating whether to automatically flush the DTMF buffer on the underlying telephony interface card before activation. Default is false, to enable type-ahead.

[0518] escape

[0519] Read-Write. The escape key that ends the DTMF reading session. The escape key is a single key.

[0520] numDigits

[0521] Read-Write. Number of key strokes that ends the DTMF reading session. If both escape and numDigits are specified, the DTMF session ends when either condition is met.

[0522] dtmfResult

[0523] Read-only string storing the DTMF keys the user has entered. The escape key is included in the result if typed.

text

Read-only string storing a whitespace-separated token string, where each token is converted according to the DTMF grammar.
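The relationship between dtmfResult (the raw keys) and text (the converted tokens) can be sketched as follows. This is only an illustration, not the browser's implementation: the function name dtmfToText and the plain-object form of the grammar are hypothetical, and the sketch assumes that keys missing from the grammar contribute no token.

```javascript
// Hypothetical stand-in for a <dtmfgrammar> key table (Example 1 above).
const cityGrammar = { "1": "Seattle", "2": "Boston" };

// Convert a raw key string (dtmfResult) into the whitespace-separated
// token string described for the text property.
// Assumption: a key that is absent from the grammar yields no token.
function dtmfToText(dtmfResult, grammar) {
  const tokens = [];
  for (const key of dtmfResult) {
    if (Object.prototype.hasOwnProperty.call(grammar, key)) {
      tokens.push(grammar[key]);
    }
  }
  return tokens.join(" ");
}

console.log(dtmfToText("1", cityGrammar));  // Seattle
console.log(dtmfToText("21", cityGrammar)); // Boston Seattle
```

With the default grammar, where each key maps to itself, the same function would simply return the entered digits separated by spaces.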

[0524] initialTimeout

[0525] Read-Write. Timeout period for receiving the first DTMF keystroke, in milliseconds. If unspecified, defaults to the telephony platform's internal setting.

[0526] interdigitTimeout

[0527] Read-Write. Timeout period between adjacent DTMF keystrokes, in milliseconds. If unspecified, defaults to the telephony platform's internal setting.

[0528] 4.3 Object Methods:

[0529] 4.3.1 Start

[0530] Enable DTMF interruption and start a DTMF reading session.

[0531] Syntax:

[0532] Object.Start( );

[0533] Return value:

[0534] None

[0535] Exception:

[0536] None

[0537] 4.3.2 Stop

[0538] Disable DTMF. The key strokes entered by the user, however, remain in the buffer.

[0539] Syntax:

[0540] Object.Stop( );

[0541] Return value:

[0542] None

[0543] Exception:

[0544] None

[0545] 4.3.3 Flush

[0546] Flush the DTMF buffer. Flush cannot be called during a DTMF session.

[0547] Syntax:

[0548] Object.Flush( );

[0549] Return value:

[0550] None

[0551] Exception:

[0552] None

[0553] 4.4 Events

[0554] 4.4.1 onkeypress

[0555] Fires when a DTMF key is pressed. This overrides the default event inherited from the HTML control. When the user hits the escape key, the onReco event fires, not onkeypress.

[0556] Syntax:
Inline HTML: <DTMF onkeypress="handler" ...>
Event property: Object.onkeypress = handler
                Object.onkeypress = GetRef("handler");

[0557] Event Object Info:
Bubbles: No
To invoke: Press on the touch-tone telephone key pad
Default action: Returns the key being pressed

[0558] Event Properties:

[0559] Although the event handler does not receive properties directly, the handler can query the event object for data.

[0560] 4.4.2 onReco

[0561] Fires when a DTMF session has ended. The event disables the current DTMF object automatically.

[0562] Syntax:
Inline HTML: <DTMF onReco="handler" ...>
Event property: Object.onReco = handler
                Object.onReco = GetRef("handler");

[0563] Event Object Info:
Bubbles: No
To invoke: User presses the escape key or the number of key strokes meets the specified value.
Default action: Returns the key being pressed

[0564] Event Properties:

[0565] Although the event handler does not receive properties directly, the handler can query the event object for data.
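The interplay of escape, numDigits, onkeypress and onReco described in this section can be modelled with a small simulation. The class below is a hypothetical sketch, not the voice browser's implementation; in particular it assumes that the key stroke which completes numDigits still fires onkeypress before onReco, which the text does not state.

```javascript
// Hypothetical model of a DTMF reading session (names are illustrative).
class DtmfSession {
  constructor({ escape = null, numDigits = null,
                onkeypress = () => {}, onReco = () => {} } = {}) {
    this.escape = escape;
    this.numDigits = numDigits;
    this.onkeypress = onkeypress;
    this.onReco = onReco;
    this.active = false;
    this.dtmfResult = "";
  }

  start() {
    this.active = true;
    this.dtmfResult = "";
  }

  // Feed one key press into the session; ignored when the object is disabled.
  press(key) {
    if (!this.active) return;
    this.dtmfResult += key; // the escape key is included in the result if typed
    if (key === this.escape) {
      this.end(); // escape fires onReco, not onkeypress
      return;
    }
    this.onkeypress(key);
    if (this.numDigits !== null && this.dtmfResult.length >= this.numDigits) {
      this.end(); // key stroke count reached
    }
  }

  // Ending the session disables the object and fires onReco.
  end() {
    this.active = false;
    this.onReco(this.dtmfResult);
  }
}

const events = [];
const session = new DtmfSession({
  escape: "#",
  numDigits: 7,
  onkeypress: (k) => events.push("keypress:" + k),
  onReco: (r) => events.push("reco:" + r),
});
session.start();
for (const k of "555#") session.press(k);
console.log(events); // three keypress entries, then "reco:555#"
```

Here the session ends on the escape key before numDigits is reached, matching the rule that the session ends when either condition is met.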

[0566] 4.4.3 onTimeout

[0567] Fires when no phrase finish event is received before the timeout. The event halts the recognition process automatically.

[0568] Syntax:
Inline HTML: <DTMF onTimeout="handler" ...>
Event property (in ECMAScript): Object.onTimeout = handler
                Object.onTimeout = GetRef("handler");

[0569] Event Object Info:
Bubbles: No
To invoke: No DTMF key stroke is detected within the timeout specified.
Default action: None

[0570] Event Properties:

[0571] Although the event handler does not receive properties directly, the handler can query the event object for data.

[0572] 5 CallControl Object

[0573] Represents the telephone interface (call, terminal, and connection) of the telephone voice browser. This object is as native as the window object in a GUI browser. As such, the lifetime of the telephone object is the same as the browser instance itself. A voice browser for telephony instantiates the telephone object, one for each call. Users do not instantiate or dispose of the object.

[0574] At this point, only features related to first-party call controls are exposed through this object.

[0575] 5.1 Properties

[0576] address

[0577] Read-only. XML DOM node object. Implementation specific. This is the address of the caller. For PSTN, it may be a combination of ANI and ALI. For VoIP, this is the caller's IP address.

[0578] ringsBeforeAnswer

[0579] Number of rings before answering an incoming call. Default is infinite, meaning the developer must specifically use the Answer( ) method below to answer the phone call. When the call center uses ACD to queue up the incoming phone calls, this number can be set to 0.

[0580] 5.2 Methods

[0581] Note: all the methods here are synchronous.

[0582] 5.2.1 Transfer

[0583] Transfers the call. For a blind transfer, the system may terminate the original call and free system resources once the transfer completes.

[0584] Syntax:

[0585] telephone.Transfer(strText);

[0586] Parameters:

[0587] strText: Required. The address of the intended receiver.

[0588] Return value:

[0589] None.

[0590] Exception:

[0591] Throws an exception when the call transfer fails, e.g., when the end party is busy, there is no such number, or a fax or answering machine answers.
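Because Transfer is synchronous and reports failure by throwing, a dialog script would typically wrap the call so it can fall back to another prompt. The sketch below uses a stub object in place of the browser-provided telephone object; the stub, the helper name tryTransfer, and the sample addresses are all assumptions made for illustration.

```javascript
// Stub standing in for the browser-provided telephone object (assumption).
const telephone = {
  Transfer(strText) {
    // Simulate a failed transfer for one made-up address.
    if (strText === "busy-line") throw new Error("end party is busy");
  },
};

// Attempt a transfer and report the outcome instead of letting the
// exception escape, so the caller can choose a fallback dialog.
function tryTransfer(phone, address) {
  try {
    phone.Transfer(address);
    return { ok: true };
  } catch (err) {
    return { ok: false, reason: err.message };
  }
}

console.log(tryTransfer(telephone, "operator"));  // { ok: true }
console.log(tryTransfer(telephone, "busy-line")); // { ok: false, reason: 'end party is busy' }
```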

[0592] 5.2.2 Bridge

[0593] Third party transfer. After the call is transferred, the browser may release resources allocated for the call. It is up to the application to recover the session state when the transferred call returns using strUID. The underlying telephony platform may route the returning call to a different browser. The call can return only when the recipient terminates the call.

[0594] Syntax:

[0595] telephone.Bridge(strText, strUID, [imaxTime]);

[0596] Parameters:

[0597] strText: Required. The address of the intended receiver.

[0598] strUID: Required. The session ID uniquely identifying the current call. When the transferred call is routed back, the strUID will appear in the address attribute.

[0599] imaxTime: Optional. Maximum duration in seconds of the transferred call. If unspecified, defaults to platform-internal value.

[0600] Return value:

[0601] None.

[0602] Exception:

[0603] None.

[0604] 5.2.3 Answer

[0605] Answers the phone call.

[0606] Syntax:

[0607] telephone.Answer( );

[0608] Return value:

[0609] None.

[0610] Exception:

[0611] Throws an exception when there is no connection. No onAnswer event will be fired in this case.

[0612] 5.2.4 Hangup

[0613] Terminates the phone call. Has no effect if no call is currently in progress.

[0614] Syntax:

[0615] telephone.Hangup( );

[0616] Return value:

[0617] None.

[0618] Exception:

[0619] None.

[0620] 5.2.5 Connect

[0621] Starts a first-party outbound phone call.

[0622] Syntax:

[0623] telephone.Connect(strText, [iTimeout]);

[0624] Parameters:

[0625] strText: Required. The address of the intended receiver.

[0626] iTimeout: Optional. The time in milliseconds before abandoning the attempt. If unspecified, defaults to platform-internal value.

[0627] Return value:

[0628] None.

[0629] Exception:

[0630] Throws an exception when the call cannot be completed, including encountering busy signals or reaching a fax or answering machine (Note: hardware may not support this feature).

[0631] 5.2.6 Record

[0632] Record user audio to file.

[0633] Syntax:

[0634] telephone.Record(url, endSilence, [maxTimeout], [initialTimeout]);

[0635] Parameters:

[0636] url: Required. The url of the recorded results.

[0637] endSilence: Required. Time in milliseconds to stop recording after silence is detected.

[0638] maxTimeout: Optional. The maximum time in seconds for the recording. Default is platform-specific.

[0639] initialTimeout: Optional. Maximum time (in milliseconds) of silence allowed at the beginning of a recording.

[0640] Return value:

[0641] None.

[0642] Exception:

[0643] Throws an exception when the recording cannot be written to the url.

[0644] 5.3 Event Handlers

[0645] Application developers using the telephone voice browser may implement the following event handlers.

[0646] 5.3.1 onIncoming( )

[0647] Called when the voice browser receives an incoming phone call. Developers can use this handler to read the caller's address and invoke customized features before answering the phone call.

[0648] 5.3.2 onAnswer( )

[0649] Called when the voice browser answers an incoming phone call.

[0650] 5.3.3 onHangup( )

[0651] Called when the user hangs up the phone. This event is NOT automatically fired when the program calls the Hangup or Transfer methods.

5.4 EXAMPLE

[0652] This example shows scripting wired to the call control events to manipulate the telephony session.

<HTML>
<HEAD>
  <TITLE>Logon Page</TITLE>
</HEAD>
<SCRIPT>
  var focus;
  // Prompt for whichever field is still empty; submit once both are filled.
  function RunSpeech( ) {
    if (logon.user.value == "") {
      focus = "user";
      p_uid.Start( ); g_login.Start( ); dtmf.Start( );
      return;
    }
    if (logon.pass.value == "") {
      focus = "pin";
      p_pin.Start( ); g_login.Start( ); dtmf.Start( );
      return;
    }
    p_thank.Start( );
    logon.submit( );
  }
  // Copy spoken recognition results into the form fields.
  function login_reco( ) {
    res = event.srcElement.recoResult;
    pNode = res.selectSingleNode("//uid");
    if (pNode != null) logon.user.value = pNode.xml;
    pNode = res.selectSingleNode("//password");
    if (pNode != null) logon.pass.value = pNode.xml;
  }
  // Copy keyed-in DTMF digits into the field that currently has focus.
  function dtmf_reco( ) {
    res = event.srcElement.dtmfResult;
    if (focus == "user") logon.user.value = res;
    else logon.pass.value = res;
  }
</SCRIPT>
<SCRIPT for="callControl" event="onIncoming">
<!--
  // read address, prepare customized stuff if any
  callControl.Answer( );
// -->
</SCRIPT>
<SCRIPT for="callControl" event="onOffhook">
<!--
  p_main.Start( ); g_login.Start( ); dtmf.Start( );
  focus = "user";
// -->
</SCRIPT>
<SCRIPT for="window" event="onload">
<!--
  if (logon.user.value != "") {
    p_retry.Start( );
    logon.user.value = "";
    logon.pass.value = "";
    RunSpeech( );
  }
// -->
</SCRIPT>
<BODY>
  <reco id="g_login" onReco="login_reco( ); RunSpeech( )" timeout="5000"
        onTimeout="p_miss.Start( ); RunSpeech( )">
    <grammar src="http://kokaneel/etradedemo/speechonly/login.xml" />
  </reco>
  <dtmf id="dtmf" escape="#" onkeypress="g_login.Stop( );"
        onReco="dtmf_reco( ); RunSpeech( )" interdigitTimeout="5000"
        onTimeout="dtmf.Flush( ); p_miss.Start( ); RunSpeech( )" />
  <prompt id="p_main">Please say your user I D and pin number</prompt>
  <prompt id="p_uid">Please just say your user I D</prompt>
  <prompt id="p_pin">Please just say your pin number</prompt>
  <prompt id="p_miss">Sorry, I missed that</prompt>
  <prompt id="p_thank">Thank you. Please wait while I verify your identity</prompt>
  <prompt id="p_retry">Sorry, your user I D and pin number do not match</prompt>
  <H2>Login</H2>
  <form id="logon">
    UID: <input name="user" type="text" onChange="RunSpeech( )" />
    PIN: <input name="pass" type="password" onChange="RunSpeech( )" />
  </form>
</BODY>
</HTML>

[0653] 6 Controlling Dialog Flow

[0654] 6.1 Using HTML and Script to Implement Dialog Flow

[0655] This example shows how to implement a simple dialog flow which seeks values for input boxes and offers context-sensitive help for the input. It uses the title attribute on the HTML input mechanisms (used in a visual browser as a “tooltip” mechanism) to help form the content of the help prompt.

<html>
<head>
  <title>Context Sensitive Help</title>
  <script>
    var focus;
    // Ask for the first field that is still empty; submit when all are filled.
    function RunSpeech( ) {
      if (trade.stock.value == "") {
        focus = "trade.stock";
        p_stock.Start( );
        return;
      }
      if (trade.op.value == "") {
        focus = "trade.op";
        p_op.Start( );
        return;
      }
      //.. repeat above for all fields
      trade.submit( );
    }
    // Build the help prompt from the title of the field in focus.
    function handle( ) {
      res = event.srcElement.recoResult;
      if (res.text == "help") {
        text = "Please just say ";
        text += document.all[focus].title;
        p_help.Start(text);
      } else {
        // proceed with value assignments
      }
    }
  </script>
</head>
<body>
  <prompt id="p_help" onComplete="RunSpeech( )" />
  <prompt id="p_stock" onComplete="g_stock.Start( )">Please say the stock name</prompt>
  <prompt id="p_op" onComplete="g_op.Start( )">Do you want to buy or sell</prompt>
  <prompt id="p_quantity" onComplete="g_quantity.Start( )">How many shares?</prompt>
  <prompt id="p_price" onComplete="g_price.Start( )">What's the price</prompt>
  <reco id="g_stock" onReco="handle( ); RunSpeech( )">
    <grammar src="./g_stock.xml" />
  </reco>
  <reco id="g_op" onReco="handle( ); RunSpeech( )">
    <grammar src="./g_op.xml" />
  </reco>
  <reco id="g_quantity" onReco="handle( ); RunSpeech( )">
    <grammar src="./g_quant.xml" />
  </reco>
  <reco id="g_price" onReco="handle( ); RunSpeech( )">
    <grammar src="./g_quant.xml" />
  </reco>
  <form id="trade">
    <input name="stock" title="stock name" />
    <select name="op" title="buy or sell">
      <option value="buy" />
      <option value="sell" />
    </select>
    <input name="quantity" title="number of shares" />
    <input name="price" title="price" />
  </form>
</body>
</html>

[0656] 6.2 Using SMIL

[0657] The following example shows activation of prompt and reco elements using SMIL mechanisms.

<html xmlns:t="urn:schemas-microsoft-com:time" xmlns:sp="urn:schemas-microsoft-com:speech">
<head>
  <style>
    .time { behavior: url(#default#time2); }
  </style>
</head>
<body>
  <input name="txtBoxOrigin" type="text" />
  <input name="txtBoxDest" type="text" />
  <sp:prompt class="time" t:begin="0">
    Please say the origin and destination cities
  </sp:prompt>
  <t:par t:begin="time.end" t:repeatCount="indefinitely">
    <sp:reco class="time">
      <grammar src="./city.xml" />
      <bind targetElement="txtBoxOrigin" value="//origin_city" />
      <bind targetElement="txtBoxDest" test="/sml/dest_city[@confidence $gt$ 40]" value="//dest_city" />
    </sp:reco>
  </t:par>
</body>
</html>

Appendix B

[0658] 1 QA Speech Control

[0659] The QA control adds speech functionality to the primary control to which it is attached. Its object model is an abstraction of the content model of the exemplary tags in Appendix A.

[0660] 1.1 QA Control

<Speech:QA id="..." controlsToSpeechEnable="..." speechIndex="..." ClientTest="..." runat="server">
  <Question ...>
  <Statement ...>
  ...
  <Answer ...>
  <Confirm ...>
  ...
  <Command ...>
  ...
</Speech:QA>

[0661] 1.1.1 Core Properties

[0662] String ControlsToSpeechEnable

[0663] ControlsToSpeechEnable specifies the list of IDs of the primary controls to speech enable. IDs are comma delimited.

[0664] 1.1.2 Activation Mechanisms

[0665] int SpeechIndex

[0666] SpeechIndex specifies the ordering information of the QA control; this is used by RunSpeech. Note: If more than one QA control has the same SpeechIndex, RunSpeech will execute them in source order. In situations where some QA controls have SpeechIndex specified and some QA controls do not, RunSpeech will order the QA controls first by SpeechIndex, then by source order.
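The ordering rule can be illustrated with a short sort routine. This is a sketch rather than the RunSpeech implementation: the function name and the object shape are hypothetical, and it assumes that controls with no SpeechIndex are placed after all indexed controls, which the text implies but does not state outright.

```javascript
// Order QA controls as described: by SpeechIndex first, then by source order.
// Assumption: controls without a SpeechIndex follow all indexed controls.
function orderQaControls(controls) {
  return controls
    .map((control, sourcePosition) => ({ control, sourcePosition }))
    .sort((a, b) => {
      const ai = a.control.speechIndex ?? Infinity;
      const bi = b.control.speechIndex ?? Infinity;
      if (ai !== bi) return ai < bi ? -1 : 1;
      return a.sourcePosition - b.sourcePosition; // ties keep source order
    })
    .map((entry) => entry.control);
}

// Listed here in page (source) order; the ids are made up.
const qaControls = [
  { id: "QA_c" },
  { id: "QA_a", speechIndex: 2 },
  { id: "QA_b", speechIndex: 1 },
  { id: "QA_d", speechIndex: 2 },
];
console.log(orderQaControls(qaControls).map((c) => c.id).join(","));
// QA_b,QA_a,QA_d,QA_c
```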

[0667] string ClientTest

[0668] ClientTest specifies a client-side script function which returns a boolean value to determine when the QA control is considered available for selection by the RunSpeech algorithm. The system strategy can therefore be changed by using this as a condition to activate or de-activate QA controls more sensitively than SpeechIndex. If not specified, the QA control is considered available for activation.

[0669] 1.1.3 Questions, Statements, Answers, Confirms and Commands

[0670] Question[] Questions

[0671] QA control contains an array of question objects or controls, defined by the dialog author. Each question control will typically relate to a function of the system, e.g. asking for a value. Each question control may specify an activation function using the ClientTest attribute, so an active QA control may ask different kinds of questions about its primary control under different circumstances. For example, the activation condition for the main question Q_Main may be that the corresponding primary control has no value, and the activation condition for a Q_GiveHelp may be that the user has just requested help. Each question may specify answer controls from within the QA control which are activated when the question control is outputted.

[0672] Statement[] Statement

[0673] QA control contains an array of statement objects or controls. Statements are used to provide information to the listener, such as welcome prompts.

[0674] Answer[] Answers

[0675] QA control contains an array of answer objects or controls. An answer control is activated directly by a question control within the QA control, or by a StartEvent from the primary control. Where multiple answers are used, they will typically reflect answers to the system functions, e.g. A_Main might provide a value in response to Q_Main, and A_Confirm might provide a yes/no + correction to Confirm.

[0676] Confirm[] Confirm

[0677] QA control may contain a confirm object or control. This object is a mechanism provided to dialog authors that simplifies the authoring of common confirmation subdialogs.

[0678] Command[] Command

[0679] A Command array holds a set of command controls. Command controls can be thought of as answer controls without question controls, whose behavior on recognition can be scoped down the control tree.

[0680] 1.2 Question Control

[0681] The question control is used for the speech output relating to a given primary control. It contains a set of prompts for presenting information or asking a question, and a list of ids of the answer controls which may provide an answer to that question. If multiple answer controls are specified, their grammars are loaded in parallel when the question is activated. An exception will be thrown if no answer control is specified in the question control.

<Question id="..." ClientTest="..." Answers="..." Count="..." initialTimeout="..." babbleTimeout="..." maxTimeout="..." Modal="..." PromptFunction="..." OnClientNoReco="...">
  <prompt ... />
  ...
</Question>

[0682] string ClientTest

[0683] ClientTest specifies the client-side script function returning a boolean value which determines under which circumstances a question control is considered active within its QA control (the QA control itself must be active for the question to be evaluated). For a given QA control, the first question control with a true condition is selected for output. For example, the function may be used to determine whether to output a question which asks for a value (“Which city do you want?”) or which attempts to confirm it (“Did you say London?”). If not specified, the question condition is considered true.

[0684] Prompt[] Prompts

[0685] The prompt array specifies a list of prompt objects, discussed below. Prompts are also able to specify conditions of selection (via client functions), and during RunSpeech execution only the first prompt with a true condition is selected for playback.

[0686] string Answers

[0687] Answers is an array of references by ID to controls that are possible answers to the question. The behavior is to activate the grammar from each valid answer control in response to the prompt asked by the question control.

[0688] Integer initialTimeout

[0689] The time in milliseconds between start of recognition and the detection of speech. This value is passed to the recognition platform, and if exceeded, an onSilence event will be thrown from the recognition platform. If not specified, the speech platform will use a default value.

[0690] Integer BabbleTimeout

[0691] The period of time in milliseconds in which the recognition server or other recognizer must return a result after detection of speech. For recos in “tap-and-talk” scenarios this applies to the period between speech detection and the recognition result becoming available. For recos in dictation scenarios, this timeout applies to the period between speech detection and each recognition return, i.e. the period is restarted after each return of results or other event. If exceeded, the onClientNoReco event is thrown but different status codes are possible. If there has been any kind of recognition platform error that is detectable and the babbleTimeout period has elapsed, then an onClientNoReco is thrown but with a status code -3. Otherwise, if the recognizer is still processing audio (e.g. in the case of an exceptionally long utterance or if the user has kept the pen down for an excessive amount of time), the onClientNoReco event is thrown with status code -15. If babbleTimeout is not specified, the speech platform will default to an internal value.
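The two status codes named above can be summarised as a small decision function. The function and its parameter names are hypothetical; the text only specifies the codes -3 and -15, so the sketch returns null for any other situation rather than guessing a code.

```javascript
// Pick the onClientNoReco status code once babbleTimeout has elapsed.
// Only the two codes given in the text are modelled; anything else
// returns null because the text does not define it.
function noRecoStatus({ platformError, stillProcessingAudio }) {
  if (platformError) return -3;          // detectable recognition platform error
  if (stillProcessingAudio) return -15;  // e.g. an exceptionally long utterance
  return null;
}

console.log(noRecoStatus({ platformError: true, stillProcessingAudio: false }));  // -3
console.log(noRecoStatus({ platformError: false, stillProcessingAudio: true }));  // -15
```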

[0692] Integer MaxTimeout

[0693] The period of time in milliseconds between recognition start and results returned to the client device browser. If exceeded, the onMaxTimeout event is thrown by the browser; this caters for network or recognizer failure in distributed environments. For recos in dictation scenarios, as with babbleTimeout, the period is restarted after the return of each recognition or other event. Note that the maxTimeout attribute should be greater than or equal to the sum of initialTimeout and babbleTimeout. If not specified, the value will be a browser default.
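The constraint between the three timeouts lends itself to a simple authoring-time check. The helper below is a hypothetical sketch; it assumes that when any of the three values is left unspecified the platform or browser default applies and nothing can be checked.

```javascript
// Check that maxTimeout >= initialTimeout + babbleTimeout (all in milliseconds).
// Assumption: if any value is unspecified, defaults apply and the check passes.
function checkTimeouts({ initialTimeout, babbleTimeout, maxTimeout }) {
  const values = [initialTimeout, babbleTimeout, maxTimeout];
  if (values.some((t) => t === undefined)) return true;
  return maxTimeout >= initialTimeout + babbleTimeout;
}

console.log(checkTimeouts({ initialTimeout: 3000, babbleTimeout: 5000, maxTimeout: 10000 })); // true
console.log(checkTimeouts({ initialTimeout: 3000, babbleTimeout: 5000, maxTimeout: 7000 }));  // false
```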

[0694] Bool Modal

[0695] When modal is set to true, no answers except the immediate set of answers to the question are activated (i.e. no scoped answers are considered). The default is false. For example, this attribute allows the application developer to force the user of the client device to answer a particular question.

[0696] string PromptFunction(prompt)

[0697] PromptFunction specifies a client-side function that will be called once the question has been selected but before the prompt is played. This gives the application developer a chance to perform last-minute modifications to the prompt that may be required. PromptFunction takes the ID of the target prompt as a required parameter.

[0698] string OnClientNoReco

[0699] OnClientNoReco specifies the name of the client-side function to call when the NoReco (mumble) event is received.

[0700] 1.2.1 Prompt Object

[0701] The prompt object contains information on how to play prompts. All the properties defined are read/write properties.

<prompt id="..." count="..." ClientTest="..." source="..." bargeIn="..." onClientBargein="..." onClientComplete="..." onClientBookmark="...">
  ...text/markup of the prompt...
</prompt>

[0702] Int Count

[0703] Count specifies an integer which is used for prompt selection. When the value of the count specified on a prompt matches the value of the count of its question control, the prompt is selected for playback. Legal values are 0-100.

<Question id="Q_Ask">
  <prompt count="1"> Hello </prompt>
  <prompt count="2"> Hello again </prompt>
</Question>

[0704] In the example, when Q_Ask.count is equal to 1, the first prompt is played, and if it is equal to 2 (i.e. the question has already been output before), the second prompt is then played.

[0705] string ClientTest

[0706] ClientTest specifies the client-side script function returning a boolean value which determines under which circumstances a prompt within an active question control will be selected for output. For a given question control, the first prompt with a true condition is selected. For example, the function may be used to implement prompt tapering, e.g. “Which city would you like to depart from?” for a function returning true if the user is a first-timer, or “Which city?” for an old hand. If not specified, the prompt's condition is considered true.

[0707] string InlinePrompt

[0708] The prompt property contains the text of the prompt to play. This is defined as the content of the prompt element. It may contain further markup, as in TTS rendering information, or <value> elements. As with all parts of the page, it may also be specified as script code within <script> tags, for dynamic rendering of prompt output.

[0709] string Source

[0710] Source specifies the URL from which to retrieve the text of the prompt to play. If an inline prompt is specified, this property is ignored.

[0711] Bool BargeIn

[0712] BargeIn is used to specify whether or not barge-in (wherein the user of the client device begins speaking when a prompt is being played) is allowed on the prompt. The default is true.

[0713] string onClientBargein

[0714] onClientBargein specifies the client-side script function which is invoked by the bargein event.

[0715] string onClientComplete

[0716] onClientComplete specifies the client-side script function which is invoked when the playing of the prompt has completed.

[0717] string OnClientBookmark

[0718] OnClientBookmark accesses the name of the client-side function to call when a bookmark is encountered.

[0719] 1.2.2 Prompt Selection

[0720] On execution by RunSpeech, a QA control selects its prompt in the following way:

[0721] ClientTest and the count attribute of each prompt are evaluated in order. The first prompt with both ClientTest and count true is played. A missing count is considered true. A missing ClientTest is considered true.
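The selection rule can be sketched as a single pass over the prompts. The function name and object shape below are hypothetical, and the sketch reads "count true" as "the prompt's count equals the count of its question control", following the description of Count above.

```javascript
// Return the first prompt whose ClientTest and count both hold.
// A missing clientTest or a missing count is treated as true.
function selectPrompt(prompts, questionCount) {
  for (const prompt of prompts) {
    const testOk = prompt.clientTest === undefined || prompt.clientTest() === true;
    const countOk = prompt.count === undefined || prompt.count === questionCount;
    if (testOk && countOk) return prompt;
  }
  return null; // no prompt qualifies
}

// Mirrors the Q_Ask example: "Hello" on first output, "Hello again" on second.
const askPrompts = [
  { text: "Hello", count: 1 },
  { text: "Hello again", count: 2 },
];
console.log(selectPrompt(askPrompts, 1).text); // Hello
console.log(selectPrompt(askPrompts, 2).text); // Hello again
```

Returning null when nothing matches is a choice made for this sketch; the text does not say what RunSpeech does in that case.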

[0722] 1.3 Statement Control

[0723] Statement controls are used for information-giving system output when the activation of grammars is not required. This is common in voice-only dialogs. Statements are played only once per page if the playOnce attribute is true.

<Statement id="..." playOnce="..." ClientTest="..." PromptFunction="...">
  <prompt ... />
  ...
</Statement>

[0724] Bool playOnce

[0725] The playOnce attribute specifies whether or not a statement control may be activated more than once per page. playOnce is a Boolean attribute with a default (if not specified) of TRUE, i.e., the statement control is executed only once. For example, the playOnce attribute may be used on statement controls whose purpose is to output email messages to the end user. Setting playOnce=“False” will provide dialog authors with the capability to enable a “repeat” functionality on a page that reads email messages.
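One way to picture playOnce is as a per-page record of which statements have already been played. The factory below is a hypothetical sketch of that bookkeeping, not the control's implementation; the statement ids are made up.

```javascript
// Create a per-page tracker that decides whether a statement may play.
function makeStatementPlayer() {
  const played = new Set(); // ids of statements already played on this page
  return function shouldPlay(statement) {
    const playOnce = statement.playOnce !== false; // default is TRUE
    if (playOnce && played.has(statement.id)) return false;
    played.add(statement.id);
    return true;
  };
}

const shouldPlay = makeStatementPlayer();
console.log(shouldPlay({ id: "welcome" }));                  // true
console.log(shouldPlay({ id: "welcome" }));                  // false
console.log(shouldPlay({ id: "mail", playOnce: false }));    // true
console.log(shouldPlay({ id: "mail", playOnce: false }));    // true
```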

[0726] string ClientTest

[0727] ClientTest specifies the client-side script function returning a boolean value which determines under which circumstances a statement control will be selected for output. RunSpeech will activate the first Statement with ClientTest equal to true. If not specified, the ClientTest condition is considered true.

[0728] String PromptFunction

[0729] PromptFunction specifies a client-side function that will be called once the statement control has been selected but before the prompt is played. This gives authors a chance to make last-minute modifications to the prompt that may be required.

[0730] Prompt[] Prompt

[0731] The prompt array specifies a list of prompt objects. Prompts are also able to specify conditions of selection (via client functions), and during RunSpeech execution only the first prompt with a true condition is selected for playback.

<Speech:QA id="QA_Welcome" ControlsToSpeechEnable="Label1" runat="server">
  <Statement id="WelcomePrompt">
    <prompt bargeIn="False"> Welcome </prompt>
  </Statement>
</Speech:QA>

[0732] 1.4 Confirm Control

[0733] Confirm controls are special types of question controls. They may hold all the properties and objects of other question controls, but they are activated differently. The RunSpeech algorithm will check the confidence score against the confirmThreshold of the answer control of the ControlsToSpeechEnable. If it is too low, the confirm control is activated. If the confidence score of the answer control is below the confirmThreshold, then the binding is done but the onClientReco method is not called. The dialog author may specify more than one confirm control per QA control. RunSpeech will determine which confirm control to activate based on the function specified by ClientTest.

<Answer ConfirmThreshold=... />
<Confirm>
  ...all attributes and objects of Question...
</Confirm>

[0734] 1.5 Answer Control

[0735] The answer control is used to specify speech input resources and features. It contains a set of grammars related to the primary control. Note that an answer may be used independently of a question, in multimodal applications without prompts, for example, or in telephony applications where user initiative may be enabled by extra-answers. Answer controls are activated directly by question controls, by a triggering event, or by virtue of explicit scope. An exception will be thrown if no grammar object is specified in the answer control.

<Answer id="..." scope="..." StartEvent="..." StopEvent="..." ClientTest="..." onClientReco="..." onClientDTMF="..." autobind="..." server="..." ConfirmThreshold="..." RejectThreshold="...">
  <grammar ... />
  <grammar ... />
  ...
  <dtmf ... />
  <dtmf ... />
  ...
  <bind ... />
  <bind ... />
  ...
</Answer>

[0736] string Scope

[0737] Scope holds the id of any named element on the page. Scope is used in answer control for scoping the availability of user initiative (mixed task initiative: i.e. service jump digressions) grammars. If scope is specified in an answer control, then it will be activated whenever a QA control corresponding to a primary control within the subtree of the contextual control is activated.

[0738] string StartEvent

[0739] StartEvent specifies the name of the event from the primary control that will activate the answer control (start the Reco object). This will typically be used in multi-modal applications, e.g. onMouseDown, for tap-and-talk.

[0740] string StopEvent

[0741] StopEvent specifies the name of the event from the primary control that will de-activate the answer control (stop the Reco object). This will typically be used in multi-modal applications, e.g. onMouseUp, for tap-and-talk.

[0742] string ClientTest

[0743] ClientTest specifies the client-side script function returning a boolean value which determines under which circumstances an answer control otherwise selected by scope or by a question control will be considered active. For example, the test could be used during confirmation for a ‘correction’ answer control to disable itself when it is activated by a question control but mixed initiative is not desired (leaving only accept/deny answer controls active). Or a scoped answer control which permits a service jump can determine more flexible means of activation by specifying a test which is true or false depending on another part of the dialog. If not specified, the answer control's condition is considered true.

[0744] Grammar[] Grammars

[0745] Grammars accesses a list of grammar objects.

[0746] DTMF[] DTMFs

[0747] DTMFs holds an array of DTMF objects.

[0748] Bind[] Binds

[0749] Binds holds a list of the bind objects necessary to map the answer control grammar results (dtmf or spoken) into control values. All binds specified for an answer will be executed when the relevant output is recognized. If no bind is specified, the SML output returned by recognition will be bound to the control specified in the ControlsToSpeechEnable of the QA control.

[0750] string OnClientReco

[0751] OnClientReco specifies the name of the client-side function to call when spoken recognition results become available.

[0752] string OnClientDTMF

[0753] OnClientDTMF holds the name of the client-side function to call when DTMF recognition results become available.

[0754] Boolean Autobind

[0755] The value of autobind determines whether or not the system default bindings are implemented for a recognition return from the answer control. If unspecified, the default is true. Setting autobind to false is an instruction to the system not to perform the automatic binding.

[0756] string Server

[0757] The server attribute is an optional attribute specifying the URI of the speech server to perform the recognition. This attribute overrides the URI of the global speech server attribute.

[0758] integer ConfirmThreshold

[0759] Holds a value representing the confidence level below which a confirm control question will be automatically triggered immediately after an answer is recognized within the QA control. Legal values are 0-100.

[0760] Note that where bind statements and onClientReco scripts are both specified, the semantics of the resulting Tags are that binds are implemented before the script specified in onClientReco.

[0761] integer RejectThreshold

[0762] RejectThreshold specifies the minimum confidence score to consider returning a recognized utterance. If overall confidence is below this level, a NoReco event will be thrown. Legal values are 0-100.
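Taken together, RejectThreshold and ConfirmThreshold split the 0-100 confidence range into three bands. The function below is a sketch of that reading; its name, the returned labels, and the default of 0 for an unspecified threshold are assumptions made for illustration.

```javascript
// Classify a recognition result by its confidence score (0-100).
// Assumption: an unspecified threshold behaves like 0, i.e. never triggers.
function classifyRecognition(confidence, { rejectThreshold = 0, confirmThreshold = 0 } = {}) {
  if (confidence < rejectThreshold) return "noReco";   // NoReco event is thrown
  if (confidence < confirmThreshold) return "confirm"; // confirm control is triggered
  return "accept";
}

const thresholds = { rejectThreshold: 30, confirmThreshold: 60 };
console.log(classifyRecognition(20, thresholds)); // noReco
console.log(classifyRecognition(45, thresholds)); // confirm
console.log(classifyRecognition(80, thresholds)); // accept
```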

[0763] 1.5.1 Grammar

[0764] The grammar object contains information on the selection and content of grammars, and the means for processing recognition results. All the properties defined are read/write properties.
<Grammar ClientTest="..." Source="...">
  ...grammar rules...
</Grammar>

[0765] string ClientTest

[0766] The ClientTest property references a client-side boolean function which determines under which conditions a grammar is active. If multiple grammars are specified within an answer control (e.g. to implement a system/mixed initiative strategy, or to reduce the perplexity of possible answers when the dialog is going badly), only the first grammar with a true ClientTest function will be selected for activation during RunSpeech execution. If this property is unspecified, true is assumed.

[0767] string Source

[0768] Source accesses the URI of the grammar to load, if specified.

[0769] string InlineGrammar

[0770] InlineGrammar accesses the text of the grammar if specified inline. If that property is not empty, the Source attribute is ignored.
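The grammar selection rule and the precedence of InlineGrammar over Source can be sketched as follows. This is a minimal illustration, not the actual RunSpeech code. The object shape { clientTest, source, inlineGrammar } and both function names are assumptions made for the example.

```javascript
// Hypothetical model of a grammar object: { clientTest, source, inlineGrammar }.
// clientTest, when present, is a function returning a boolean.
function selectActiveGrammar(grammars) {
  for (const grammar of grammars) {
    // An unspecified ClientTest is treated as true.
    if (grammar.clientTest === undefined || grammar.clientTest() === true) {
      return grammar; // only the first passing grammar is activated
    }
  }
  return null; // no grammar is active
}

// A non-empty InlineGrammar takes precedence; Source is then ignored.
function describeGrammarContent(grammar) {
  if (grammar.inlineGrammar) {
    return { kind: "inline", text: grammar.inlineGrammar };
  }
  return { kind: "uri", uri: grammar.source };
}
```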

[0771] 1.5.2 Bind

[0772] The object model for bind follows closely its counterpart client side tags. Binds may be specified both for spoken grammar and for DTMF recognition returns in a single answer control.
<bind Value="..." TargetElement="..." TargetAttribute="..." Test="..." />

[0773] string Value

[0774] Value specifies the text that will be bound into the target element. It is specified as an XPath on the SML output from recognition.

[0775] string TargetElement

[0776] TargetElement specifies the id of the primary control to which the bind statement applies. If not specified, this is assumed to be the ControlsToSpeechEnable of the relevant QA control.

[0777] string TargetAttribute

[0778] TargetAttribute specifies the attribute on the TargetElement control in which to bind the value. If not specified, this is assumed to be the Text property of the target element.

[0779] string Test

[0780] The Test attribute specifies a condition which must evaluate to true for the binding to take place. This is specified as an XML Pattern on the SML output from recognition.

[0781] 1.5.2.1 Automatic Binding

[0782] The default behavior on the recognition return to a speech-enabled primary control is to bind certain properties into that primary control. This is useful for the dialog controls to examine the recognition results from the primary controls across turns (and even pages). Answer controls will perform the following actions upon receiving recognition results:

[0783] 1. bind the SML output tree into the SML attribute of the primary control

[0784] 2. bind the text of the utterance into the SpokenText attribute of the primary control

[0785] 3. bind the confidence score returned by the recognizer into the Confidence attribute of the primary control.

[0786] Unless the autobind=“False” attribute is specified on an answer control, the answer control will perform the following actions on the primary control:

[0787] 1. bind the SML output tree into the SML attribute;

[0788] 2. bind the text of the utterance into the SpokenText attribute;

[0789] 3. bind the confidence score returned by the recognizer into the Confidence attribute.

[0790] Any values already held in the attribute will be overwritten. Automatic binding occurs before any author-specified bind commands, and hence before any onClientReco script (which may also bind to these properties).
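The ordering described above (automatic binding, then author-specified binds, then the onClientReco script) can be sketched as follows. The shapes of the `answer` and `result` objects, and the representation of each bind as a function, are assumptions made for illustration only.

```javascript
// Hypothetical sketch of the order of operations on a recognition return.
// result: { sml, text, confidence }       (assumed shape)
// answer: { autobind, binds, onClientReco } (assumed shape; binds are functions)
function processRecognition(answer, primaryControl, result) {
  // 1. Automatic binding, skipped only when autobind is explicitly false.
  if (answer.autobind !== false) {
    primaryControl.SML = result.sml;
    primaryControl.SpokenText = result.text;
    primaryControl.Confidence = result.confidence;
  }
  // 2. Author-specified binds, in the order given.
  for (const bind of answer.binds || []) {
    bind(primaryControl, result);
  }
  // 3. The onClientReco script runs last and may overwrite bound values.
  if (typeof answer.onClientReco === "function") {
    answer.onClientReco(primaryControl, result);
  }
  return primaryControl;
}
```

Because each later step can overwrite values set by an earlier one, a script that adjusts the Confidence property would see the automatically bound value first.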

[0791] 1.5.3 DTMF

[0792] DTMF may be used by answer controls in telephony applications. The DTMF object essentially applies a different modality of grammar (a keypad input grammar rather than a speech input grammar) to the same answer. The DTMF content model closely matches that of the client side output Tags DTMF element. Binding mechanisms for DTMF returns are specified using the targetAttribute attribute of the DTMF object.
<DTMF firstTimeOut="..." interDigitTimeOut="..." numDigits="..." flush="..." escape="..." targetAttribute="..." ClientTest="...">
  <dtmfGrammar ...>
</DTMF>

[0793] integer firstTimeOut

[0794] The number of milliseconds to wait between activation and the first key press before raising a timeout event.

[0795] integer interDigitTimeOut

[0796] The number of milliseconds to wait between key presses before raising a timeout event.

[0797] int numDigits

[0798] The maximum number of key inputs permitted during DTMF recognition.

[0799] Bool Flush

[0800] A flag which states whether or not to flush the telephony server's DTMF buffer before recognition begins. Setting flush to false permits DTMF key input to be stored between recognition/page calls, which permits the user to ‘type-ahead’.

[0801] string Escape

[0802] Holds the string value of the key which will be used to end DTMF recognition (e.g., ‘#’).

[0803] string targetAttribute

[0804] TargetAttribute specifies the property on the primary control in which to bind the value. If not specified, this is assumed to be the Text property of the primary control.

[0805] string ClientTest

[0806] The ClientTest property references a client-side boolean function which determines under which conditions a DTMF grammar is active. If multiple grammars are specified within a DTMF object, only the first grammar with a true ClientTest function will be selected for activation during RunSpeech execution. If this property is unspecified, true is assumed.

[0807] 1.5.4 DTMFGrammar

[0808] DTMFGrammar maps a key to an output value associated with the key. The following sample shows how to map the “1” and “2” keys to text output values.
<dtmfgrammar>
  <key value="1">Seattle</key>
  <key value="2">Boston</key>
</dtmfgrammar>
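The key-to-value mapping of the sample above can be modeled on the client as a simple lookup. The sketch below is illustrative only. The `dtmfGrammar` map and the `resolveDtmfKey` function are hypothetical names, and returning null for an unmapped key is an assumption.

```javascript
// Hypothetical in-memory form of the <dtmfgrammar> sample above.
const dtmfGrammar = new Map([
  ["1", "Seattle"],
  ["2", "Boston"],
]);

// Returns the output value for a key press, or null when the key is unmapped.
function resolveDtmfKey(grammar, key) {
  return grammar.has(key) ? grammar.get(key) : null;
}
```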

[0809] 1.6 Command Control

[0810] The command control is a special variation of the answer control which can be defined in any QA control. Command controls are forms of user input which are not answers to the question at hand (e.g., Help, Repeat, Cancel), and which do not need to bind recognition results into primary controls. If the QA control specifies an activation scope, the command grammar is active for every QA control within that scope. Hence a command does not need to be activated directly by a question control or an event, and its grammars are activated in parallel, independently of the answer control building process. Command controls of the same type at QA controls lower in scope can override superior commands with context-sensitive behavior (and even different/extended grammars if necessary).
<Command id="..." scope="..." type="..." RejectThreshold="..." onClientReco="...">
  <Grammar ...>
  <dtmf ... >
  ...
</Command>

[0811] string Scope

[0812] Scope holds the id of a primary control. Scope is used in command controls for scoping the availability of the command grammars. If scope is specified for a command control, the command's grammars will be activated whenever a QA control corresponding to a primary control within the subtree of the contextual control is activated.

[0813] string Type

[0814] Type specifies the type of command (e.g., ‘help’, ‘cancel’, etc.) in order to allow the overriding of identically typed commands at lower levels of the scope tree. Any string value is possible in this attribute, so it is up to the author to ensure that types are used correctly.

[0815] integer RejectThreshold

[0816] RejectThreshold specifies the minimum confidence level of recognition that is necessary to trigger the command in recognition (this is likely to be used when higher than usual confidence is required, e.g., before executing the result of a ‘Cancel’ command). Legal values are 0-100.

[0817] string onClientReco

[0818] onClientReco specifies the client-side script function to execute on recognition of the command control's grammar.

[0819] Grammar Grammar

[0820] The grammar object which will listen for the command.

[0821] Dtmf Dtmf

[0822] The dtmf object which will activate the command.

[0823] 2 Types of Initiatives and Dialog Flows

[0824] Using the controls described above, various forms of initiatives can be developed. Some examples are provided below:

[0825] 2.1 Mixed initiative Dialogs

[0826] Mixed initiative dialogs provide the capability of accepting input for multiple controls with the asking of a single question. For example, the answer to the question “what are your travel plans” may provide values for an origin city textbox control, a destination city textbox control and a calendar control (“Fly from Puyallup to Yakima on September 30th”).

[0827] A robust way to encode mixed initiative dialogs is to handwrite the mixed initiative grammar and relevant binding statements, and apply these to a single control.

[0828] The following example shows a single page used for a simple mixed initiative voice interaction about travel. The first QA control specifies the mixed initiative grammar and binding, and a relevant prompt asking for two items. The second and third QA controls are not mixed initiative, and so bind directly to their respective primary control by default (so no bind statements are required). The RunSpeech algorithm will select the QA controls based on an attribute “SpeechIndex” and whether or not their primary controls hold valid values.
<%@ Page language="c#" AutoEventWireup="false" inherits="SDN.Page" %>
<%@ Register tagPrefix="SDN" Namespace="SDN" Assembly="SDN" %>
<html>
<body>
<Form id="WebForm1" method=post runat="server">
  <ASP:Label id="Label1" runat="server">Departure city</ASP:Label>
  <ASP:TextBox id="TextBox1" runat="server" />
  <br>
  <ASP:Label id="Label2" runat="server">Arrival city</ASP:Label>
  <ASP:TextBox id="TextBox2" textchanged="TextChanged" runat="server" />
  <!-- speech information -->
  <Speech:QA id="QAmixed" controlsToSpeechEnable="TextBox1" speechIndex="1" runat="server">
    <Question id="Q1" Answers="A1">
      <prompt>"Please say the cities you want to fly from and to"</prompt>
    </Question>
    <Answer id="A1">
      <grammar src="..."/>
      <bind targetElement="TextBox1" value="/sml/path1"/>
      <bind targetElement="TextBox2" value="/sml/path2"/>
    </Answer>
  </Speech:QA>
  <Speech:QA id="QA1" controlsToSpeechEnable="TextBox1" speechIndex="2" runat="server">
    <Question id="Q1" Answers="A1">
      <prompt>"What's the departure city?"</prompt>
    </Question>
    <Answer id="A1">
      <grammar src="..."/>
    </Answer>
  </Speech:QA>
  <Speech:QA id="QA2" controlsToSpeechEnable="TextBox2" speechIndex="3" runat="server">
    <Question id="Q1" Answers="A1">
      <prompt>"What's the arrival city"</prompt>
    </Question>
    <Answer id="A1">
      <grammar src="..."/>
    </Answer>
  </Speech:QA>
</Form>
</body>
</html>

[0829] 2.2 Complex Mixed Initiative

[0830] Application developers can specify several answers to the same question control with different levels of initiative. Conditions are specified that will select one of the answers when the question is asked, depending on the initiative settings that they require. An example is provided below:
<Speech:QA id="QA_Panel2" ControlsToSpeechEnable="Panel2" runat="server">
  <Question answers="systemInitiative, mixedInitiative" .../>
  <Answer id="systemInitiative" ClientTest="systemInitiativeCond" onClientReco="SimpleUpdate">
    <grammar src="systemInitiative.gram" />
  </Answer>
  <Answer id="mixedInitiative" ClientTest="mixedInitiativeCond" onClientReco="MixedUpdate">
    <grammar src="mixedInitiative.gram" />
  </Answer>
</Speech:QA>

[0831] Application developers can also specify several question controls in a QA control. Some question controls can allow a mixed initiative style of answer, whilst others are more directed. By authoring conditions on these question controls, application developers can select between the questions depending on the dialogue situation.

[0832] In the following example the mixed initiative question asks the value of the two textboxes at the same time (e.g., ‘what are your travel plans?’) and calls the mixed initiative answer (e.g., ‘from London to Seattle’). If this fails, then the value of each textbox is asked separately (e.g., ‘where do you leave from’ and ‘where are you going to’) but, depending on the conditions, the mixed-initiative grammar may still be activated, thus allowing users to provide both values.
<Speech:QA id="QA_Panel2" ControlsToSpeechEnable="TextBox1, TextBox2" runat="server">
  <Question ClientTest="AllEmpty( )" answers="AnsAll" .../>
  <Question ClientTest="TextBox1IsEmpty( )" answers="AnsAll, AnsTextBox1" .../>
  <Question ClientTest="TextBox2IsEmpty( )" answers="AnsAll, AnsTextBox2" .../>
  <Answer id="AnsTextBox1" onClientReco="SimpleUpdate">
    <grammar src="AnsTextBox1.gram" />
  </Answer>
  <Answer id="AnsTextBox2" onClientReco="SimpleUpdate">
    <grammar src="AnsTextBox2.gram" />
  </Answer>
  <Answer id="AnsAll" ClientTest="IsMixedInitAllowed( )" onClientReco="MixedUpdate">
    <grammar src="AnsAll.gram" />
  </Answer>
</Speech:QA>

[0833] 2.3 User Initiative

[0834] Similar to the command control, a standard QA control can specify a scope for the activation of its grammars. Like a command control, this QA control will activate the grammar from a relevant answer control whenever another QA control is activated within the scope of this context. Note that its question control will only be asked if the QA control itself is activated.
<Speech:QA id="QA_Panel2" ControlsToSpeechEnable="Panel2" runat="server">
  <Question ... />
  <Answer id="AnswerPanel2" scope="Panel2" onClientReco="UpdatePanel2()">
    <grammar src="Panel2.gram" />
  </Answer>
</Speech:QA>

[0835] This is useful for dialogs which allow ‘service jumping’, that is, user responses about some part of the dialog which is not directly related to the question control at hand.

[0836] 2.4 Short Time-Out Confirms

[0837] Application developers can write a confirmation as usual but set a short time-out. In the timeout handler, code is provided that accepts the current value as exact.
<Speech:QA id="QA_Panel2" ControlsToSpeechEnable="Panel2" runat="server">
  <Confirm timeOut="20" onClientTimeOut="AcceptConfirmation"... />
  <Answer id="CorrectPanel2" onClientReco="UpdatePanel2( )">
    <grammar src="Panel2.gram" />
  </Answer>
</Speech:QA>

[0838] 2.5 Dynamic Prompt Building and Editing

[0839] The promptFunction script is called after a question control is selected but before a prompt is chosen and played. This lets application developers build or modify the prompt at the last minute. In the example below, this is used to change the prompt depending on the level of experience of the users.
<script language=javascript>
function GetPrompt( ) {
  if (experiencedUser == true)
    Prompt1.Text = "What service do you want?";
  else
    Prompt1.Text = "Please choose between e-mail, calendar and news";
  return;
}
</script>
<Speech:QA id="QA_Panel2" ControlsToSpeechEnable="Panel2" runat="server">
  <Question PromptFunction="GetPrompt"... >
    <Prompt id="Prompt1" />
  </Question>
  <Answer ... />
</Speech:QA>

[0840] 2.6 Using Semantic Relationships

[0841] Recognition and use of semantic relationships can be done by studying the result of the recognizer inside the onReco event handler.
<script language="javascript">
function Reco( ) {
/*

[0842] Application developers can access the SML returned by the recogniser or recognition server. If a semantic relationship (like sport-news) is identified, the confidence of the individual elements can be increased, or any other appropriate action can be taken.
*/
}
</script>
<Speech:QA id="QA_Panel2" ControlsToSpeechEnable="Panel2" runat="server">
  <Question ... />
  <Answer onClientReco="Reco">
    <grammar src="Panel2.gram" />
  </Answer>
</Speech:QA>

[0843] 3 Implementation and Application of RunSpeech

[0844] A mechanism is needed to provide voice-only clients with the information necessary to properly render speech-enabled pages. Such a mechanism must provide the execution of dialog logic and maintain state of user prompting and grammar activation as specified by the application developer.

[0845] Such a mechanism is not needed for multimodal clients. In the multimodal case, the page containing speech-enabled controls is visible to the user of the client device. The user of the client device may provide speech input into any visible speech-enabled control in any desired order using a multimodal paradigm.

[0846] The mechanism used by voice-only clients to render speech-enabled pages is the RunSpeech script or algorithm. The RunSpeech script relies upon the SpeechIndex attribute of the QA control and the SpeechGroup control discussed below.

[0847] 3.1 SpeechControl

[0848] During run time, the system parses a control script or webpage having the server controls and creates a tree structure of server controls. Normally the root of the tree is the Page control. If the control script uses a custom or user control, the children tree of this custom or user control is expanded. Every node in the tree has an ID, and name conflicts can easily arise in the tree when it expands. To deal with possible name conflicts, the system includes the concept of a NamingContainer. Any node in the tree can implement NamingContainer, and its children live within that name space.

[0849] The QA controls can appear anywhere in the server control tree. In order to easily deal with SpeechIndex and manage client side rendering, a SpeechGroup control is provided. The SpeechGroup control is hidden from the application developer.

[0850] One SpeechGroup control is created and logically attached to every NamingContainer node that contains QA controls in its children tree. QA and SpeechGroup controls are considered members of their direct NamingContainer's SpeechGroup. The top level SpeechGroup control is attached to the Page object. This membership logically constructs a tree (a logical speech tree) of QA controls and SpeechGroup controls.

[0851] For simple speech-enabled pages or script (i.e., pages that do not contain other NamingContainers), only the root SpeechGroup control is generated and placed in the page's server control tree before the page is sent to the voice-only client. The SpeechGroup control maintains information regarding the number and rendering order of QA controls on the page.

[0852] For pages containing a combination of QA control(s) and NamingContainer(s), multiple SpeechGroup controls are generated: one SpeechGroup control for the page (as described above) and a SpeechGroup control for each NamingContainer. For a page containing NamingContainers, the page-level SpeechGroup control maintains QA control information as described above as well as the number and rendering order of composite controls. The SpeechGroup control associated with each NamingContainer maintains the number and rendering order of QAs within each composite.

[0853] The main job of the SpeechGroup control is to maintain the list of QA controls and SpeechGroups on each page and/or the list of QA controls comprising a composite control. When the client side markup script (e.g. HTML) is generated, each SpeechGroup writes out a QACollection object on the client side. A QACollection has a list of QA controls and QACollections. This corresponds to the logical server side speech tree. The RunSpeech script will query the page-level QACollection object for the next QA control to invoke during voice-only dialog processing.

[0854] The page-level SpeechGroup control located on each page is also responsible for:

[0855] Determining that the requesting client is a voice-only client; and

[0856] Generating common script and supporting structures for all QA controls on each page.

[0857] When the first SpeechGroup control renders, it queries the System.Web.UI.Page.Request.Browser property for the browser string. This property is then passed to the RenderSpeechHTML and RenderSpeechScript methods for each QA control on the page. The QA control will then render for the appropriate client (multimodal or voice-only).

[0858] 3.2 Creation of SpeechGroup Controls

[0859] During server-side page loading, the onLoad event is sent to each control on the page. The page-level SpeechGroup control is created by the first QA control receiving the onLoad event. The creation of SpeechGroup controls is done in the following manner (assume a page containing composite controls):

[0860] Every QA control will receive the onLoad event from the run time code. onLoad for a QA control proceeds as follows:

[0861] Get the QA control's NamingContainer N1.

[0862] Search for a SpeechGroup in N1's children.

[0863] If one already exists, register the QA control with this SpeechGroup. onLoad returns.

[0864] If not found:

[0865] Create a new SpeechGroup G1 and insert it into N1's children.

[0866] If N1 is not the Page, find N1's NamingContainer N2.

[0867] Search for a SpeechGroup in N2's children. If one exists, say G2, add G1 to G2. If not, create a new one, G2, and insert it into N2's children.

[0868] Recurse until the NamingContainer is the Page (top level).
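The registration steps above can be sketched in script form. This is an illustration only: the actual logic runs server-side in the controls, and the node shape { parent, isPage, speechGroup } and group shape { owner, members } are assumptions introduced for the example. The sketch also assumes that a newly created G2 receives G1 as its first member.

```javascript
// Hypothetical tree node: { parent, isPage, speechGroup }.
// A SpeechGroup is modeled as { owner, members: [] }.
function registerQA(qa, namingContainer) {
  let group = namingContainer.speechGroup;
  if (group) {
    // A SpeechGroup already exists on N1: register the QA control and return.
    group.members.push(qa);
    return group;
  }
  // Create G1 for N1 and register the QA control with it.
  group = { owner: namingContainer, members: [qa] };
  namingContainer.speechGroup = group;

  // Walk upward, attaching each new group to the group of its parent container.
  let child = group;
  let container = namingContainer;
  while (!container.isPage) {
    const parent = container.parent; // N2, the enclosing NamingContainer
    if (parent.speechGroup) {
      parent.speechGroup.members.push(child);
      break; // the ancestors above an existing group are already linked
    }
    parent.speechGroup = { owner: parent, members: [child] };
    child = parent.speechGroup;
    container = parent;
  }
  return group;
}
```

The iterative loop plays the role of the recursion in the listed steps: it stops either at the Page or at the first ancestor that already has a SpeechGroup.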

[0869] During server-side page rendering, the Render event is sent to the speech-enabled page. When the page-level SpeechGroup control receives the Render event, it generates client side script to include RunSpeech.js and inserts it into the page that is eventually sent to the client device. It also calls all its direct children to render speech related HTML and scripts. If a child is a SpeechGroup, the child in turn calls its children again. In this manner, the server rendering happens along the server side logical speech tree.

[0870] When a SpeechGroup renders, it lets its children (which can be either QA or SpeechGroup) render speech HTML and scripts in the order of their SpeechIndex. But a SpeechGroup is hidden and doesn't naturally have a SpeechIndex. In fact, a SpeechGroup will have the same SpeechIndex as its NamingContainer, the one it attaches to. The NamingContainer is usually a UserControl or other visible control, and an author can set a SpeechIndex on it.

[0871] 3.3 RunSpeech

[0872] The purpose of RunSpeech is to permit dialog flow via logic which is specified in script or logic on the client. In one embodiment, RunSpeech is specified in an external script file, and loaded by a single line generated by the server-side rendering of the SpeechGroup control, e.g.:
<script language="javascript" src="/scripts/RunSpeech.js" />

[0873] The RunSpeech.js script file should expose a means for validating on the client that the script has loaded correctly and has the right version id, etc. The actual validation script will be automatically generated by the page class as inline functions that are executed after the attempt to load the file.

[0874] Linking to an external script is functionally equivalent to specifying it inline, yet it is both more efficient, since browsers are able to cache the file, and cleaner, since the page is not cluttered with generic functions.

[0875] 3.4 Events

[0876] 3.4.1 Event Wiring

[0877] Tap-and-talk multimodality can be enabled by coordinating the activation of grammars with the onMouseDown event. The wiring script to do this will be generated by the Page based on the relationship between controls (as specified in the ControlsToSpeechEnable property of the QA control).

[0878] For example, given an asp:TextBox and its companion QA control adding a grammar, the <input> and <reco> elements are output by each control's Render method. The wiring mechanism to add the grammar activation command is performed by client-side script generated by the Page, which changes the attribute of the primary control to add the activation command before any existing handler for the activation event:
<!-- Control output -->
<input id="TextBox1" type="text" .../>
<reco id="Reco1" ... >
  <grammar src="..." />
</reco>
<!-- Page output -->
<script>
TextBox1.onMouseDown = "Reco1.Start( );" + TextBox1.onMouseDown;
</script>

[0879] By default, hook up is via onmousedown and onmouseup events, but both StartEvent and StopEvent can be set by the web page author.

[0880] The textbox output remains independent of this modification and the event is processed as normal if other handlers were present.
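The wiring above prepends the activation command to the handler text. Where handlers are function objects rather than strings, the same effect can be sketched as follows. The `prependHandler` helper is hypothetical and is not generated by the Page class. It is shown only to illustrate that the activation command runs first and that any existing handler is still invoked.

```javascript
// Hypothetical helper: run `action` before any handler already stored on the
// given event property, then call the existing handler as normal.
function prependHandler(element, eventProp, action) {
  const existing = element[eventProp];
  element[eventProp] = function (event) {
    action(event); // e.g. start recognition
    if (typeof existing === "function") {
      return existing.call(this, event); // preserve the original behavior
    }
    return undefined;
  };
}
```

A page script could then call, for example, prependHandler(TextBox1, "onmousedown", function () { Reco1.Start(); }), assuming TextBox1 and Reco1 are the elements from the sample above.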

[0881] 3.4.2 Page Class Properties

[0882] The Page also contains the following properties which are available to the script at runtime:

[0883] SML: a name/value pair for the ID of the control and its associated SML returned by recognition.

[0884] SpokenText: a name/value pair for the ID of the control and its associated recognized utterance.

[0885] Confidence: a name/value pair for the ID of the control and its associated confidence returned by the recognizer.

[0886] 4 RunSpeech Algorithm

[0887] The RunSpeech algorithm is used to drive dialog flow on the client device. This may involve system prompting and dialog management (typically for voice-only dialogs), and/or processing of speech input (voice-only and multimodal dialogs). It is specified as a script file referenced by URI from every relevant speech-enabled page (equivalent to inline embedded script).

[0888] Rendering of the page for voice-only browsers is done in the following manner:

[0889] The RunSpeech module or function works as follows (RunSpeech is called in response to document.onreadystate becoming “complete”):

[0890] (1) Find the first active QA control in speech index order(determining whether a QA control is active is explained below).

[0891] (2) If there is no active QA control, submit the page.

[0892] (3) Otherwise, run the QA control.

[0893] A QA control is considered active if and only if:

[0894] (1) The QA control's ClientTest either is not present or returns true, AND

[0895] (2) The QA control contains an active question control or statement control (tested in source order), AND

[0896] (3) Either:

[0897] a. The QA control contains only statement controls, OR

[0898] b. At least one of the controls referenced by the QA control's ControlsToSpeechEnable has an empty or default value.

[0899] A question control is considered active if and only if:

[0900] (1) The question control's ClientTest either is not present or returns true, AND

[0901] (2) The question control contains an active prompt object.

[0902] A prompt object is considered active if and only if:

[0903] (1) The prompt object's ClientTest either is not present or returns true, AND

[0904] (2) The prompt object's Count is either not present, or is less than or equal to the Count of the parent question control.
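The activity rules listed so far can be sketched as a set of predicates. The sketch below is a minimal illustration, not the actual RunSpeech.js. The object shapes (qa.questions, qa.statements, qa.controls, prompt.count and so on) are assumptions, and a statement control is assumed to be tested for activity in the same way as a question control, which the text does not state explicitly.

```javascript
// A missing ClientTest counts as true.
function passesClientTest(obj) {
  return obj.clientTest === undefined || obj.clientTest() === true;
}

// Prompt: ClientTest passes AND Count is absent or <= the parent's Count.
function isPromptActive(prompt, parent) {
  const countOk = prompt.count === undefined || prompt.count <= parent.count;
  return passesClientTest(prompt) && countOk;
}

// Question (and, by assumption, statement): ClientTest passes AND it
// contains at least one active prompt.
function isQuestionActive(question) {
  return passesClientTest(question) &&
    question.prompts.some((p) => isPromptActive(p, question));
}

function isQAActive(qa) {
  if (!passesClientTest(qa)) return false;
  const speakers = [...qa.questions, ...qa.statements];
  if (!speakers.some(isQuestionActive)) return false;
  const onlyStatements = qa.questions.length === 0 && qa.statements.length > 0;
  const needsValue = qa.controls.some(
    (c) => c.value === "" || c.value === c.defaultValue
  );
  return onlyStatements || needsValue;
}

// RunSpeech step (1): first active QA control in speech index order.
// A null result corresponds to step (2): submit the page.
function findFirstActiveQA(qas) {
  const ordered = [...qas].sort((a, b) => a.speechIndex - b.speechIndex);
  return ordered.find(isQAActive) || null;
}
```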

[0905] A QA control is run as follows:

[0906] (1) Determine which question control or statement control is active and increment its Count.

[0907] (2) If a statement control is active, play the prompt and exit.

[0908] (3) If a question control is active, play the prompt and start the Recos for each active answer control and command control.

[0909] An answer control is considered active if and only if:

[0910] (1) The answer control's ClientTest either is not present or returns true, AND

[0911] (2) Either:

[0912] a. The answer control was referenced in the active question control's Answers string, OR

[0913] b. The answer control is in Scope.

[0914] A command control is considered active if and only if:

[0915] (1) It is in Scope, AND

[0916] (2) There is not another command control of the same Type lower in the scope tree.
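The answer and command activity rules can be sketched in the same style. The function names and object shapes below are hypothetical. In particular, the scope test is passed in as a boolean, and the list of in-scope commands is assumed to be ordered from the outermost scope to the innermost.

```javascript
// Answer: ClientTest absent or true, AND (referenced by the active question
// OR in scope). `inScope` is computed elsewhere in this sketch.
function isAnswerActive(answer, activeQuestion, inScope) {
  const testOk = answer.clientTest === undefined || answer.clientTest() === true;
  const referenced = activeQuestion.answers.includes(answer.id);
  return testOk && (referenced || inScope);
}

// Commands: among the commands in scope, a command is shadowed by another
// command of the same Type that sits lower in the scope tree.
// `commandsInScope` is assumed to be ordered outermost first.
function activeCommands(commandsInScope) {
  const byType = new Map();
  for (const cmd of commandsInScope) {
    byType.set(cmd.type, cmd); // a lower command replaces one of the same type
  }
  return [...byType.values()];
}
```

For example, a page-level ‘help’ command and a panel-level ‘help’ command would resolve to the panel-level one while a QA control inside that panel is running.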

[0917] RunSpeech relies on events to continue driving the dialog; as described so far, it would stop after running a single QA control. Event handlers are included for Prompt.OnComplete, Reco.OnReco, Reco.OnSilence, Reco.OnMaxTimeout, and Reco.OnNoReco. Each of these will be described in turn.

[0918] RunSpeechOnComplete works as follows:

[0919] (1) If the active Prompt object has an OnClientComplete function specified, it is called.

[0920] (2) If the active Prompt object was contained within a statement control, or a question control which had no active answer controls, RunSpeech is called.

[0921] RunSpeechOnReco works as follows:

[0922] (1) Some default binding happens: the SML tree is bound to the SML attribute and the text is bound to the SpokenText attribute of each control in ControlsToSpeechEnable.

[0923] (2) If the confidence value of the recognition result is below the ConfirmThreshold of the active answer control, the Confirmation logic is run.

[0924] (3) Otherwise, if the active answer control has an OnClientReco function specified, it is called, and then RunSpeech is called.

[0925] RunSpeechOnReco is responsible for creating and setting the SML, SpokenText and Confidence properties of the ControlsToSpeechEnable. The SML, SpokenText and Confidence properties are then available to scripts at runtime.

[0926] RunSpeechOnSilence, RunSpeechOnMaxTimeout, and RunSpeechOnNoReco all work the same way:

[0927] (1) The appropriate OnClientXXX function is called, if specified.

[0928] (2) RunSpeech is called.

[0929] Finally, the Confirmation logic works as follows:

[0930] (1) If the parent QA control of the active answer control contains any confirm controls, the first active confirm control is found (the activation of a confirm control is determined in exactly the same way as the activation of a question control).

[0931] (2) If no active confirm control is found, RunSpeech is called.

[0932] (3) Else, the QA control is run, with the selected confirm control as the active question control.
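The RunSpeechOnReco steps and the Confirmation logic can be sketched together. This is an illustrative sketch under stated assumptions, not the actual handler. The `ctx` object, with its `runSpeech`, `runQA` and `isActive` callbacks, is hypothetical, and the sketch assumes that RunSpeech is called in step (3) whether or not an OnClientReco function is specified.

```javascript
// Hypothetical context: { qa, answer, isActive, runQA, runSpeech }.
// result is assumed to be { sml, text, confidence }.
function onReco(ctx, result) {
  // (1) Default binding into each control in ControlsToSpeechEnable.
  for (const control of ctx.qa.controls) {
    control.SML = result.sml;
    control.SpokenText = result.text;
  }
  // (2) Low confidence: run the Confirmation logic.
  if (result.confidence < ctx.answer.confirmThreshold) {
    const confirm = (ctx.qa.confirms || []).find((c) => ctx.isActive(c));
    if (confirm) {
      ctx.runQA(ctx.qa, confirm); // confirm acts as the active question
    } else {
      ctx.runSpeech();            // no active confirm control
    }
    return;
  }
  // (3) Otherwise call the author's script, if any, then continue the dialog.
  if (typeof ctx.answer.onClientReco === "function") {
    ctx.answer.onClientReco(result);
  }
  ctx.runSpeech();
}
```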

[0933] For multi-modal browsers, only the grammar loading and event dispatching steps are carried out.

Appendix C

[0934] 1 Design Principles

[0935] In this embodiment, there is no concept of a primary control to speech-enable as there was in Appendix B. The speech layer provides input to the visual layer as well as explicit support for dialog flow management. The semantic layer implements the logic needed for confirmation and validation. In a multimodal interaction, the semantic layer does not need to be used, as confirmation and validation are visual and implemented using standard ASP.NET constructs. If desired, though, the semantic layer can be updated with value changes made through visual or GUI interfaces so that confirmation and validation can still be implemented.

[0936] FIG. 13 illustrates the speech controls inheritance diagram.

[0937] 2 Authoring Scenarios

[0938] The following provides examples of various forms of application scenarios.

[0939] 2.1 Multimodal App, Tap-And-Talk
<speech:QA id="qa1" runat="server">
  <Answers>
    <speech:Answer SemanticItem="siText" ID="answer1" XpathTrigger="/sml/value" runat="server"></speech:Answer>
  </Answers>
  <Reco StartEvent="textbox1.onmousedown" StopEvent="textbox1.onmouseup" ID="reco1" Mode="Single">
    <Grammars>
      <speech:Grammar Src="http://mysite/mygrammar.grxml" ID="Grammar1" runat="server"></speech:Grammar>
    </Grammars>
  </Reco>
</speech:QA>

[0940] 2.2 Multimodal App, Click-And-Wait-For-Recognition
<speech:QA id="qa1" runat="server">
  <Reco id="reco1" StartEvent="textbox1.onmousedown" mode="automatic">
    <Grammars>
      <speech:grammar src="http://mysite/mygrammar.grxml" runat="server"></speech:grammar>
    </Grammars>
  </Reco>
  <Answers>
    <speech:answer id="answer1" XpathTrigger="/sml/value" SemanticItem="siText" runat="server"></speech:answer>
  </Answers>
</speech:QA>

[0941] 2.3 Multimodal App, Do-Field
<speech:QA id="qa1" runat="server">
  <Reco id="reco1" StartEvent="dofieldButton.onmousedown" StopEvent="dofieldButton.onmouseup" mode="multiple">
    <Grammars>
      <speech:grammar src="http://mysite/mylargegrammar.xml" runat="server"></speech:grammar>
    </Grammars>
  </Reco>
  <Answers>
    <speech:answer id="answer1" XpathTrigger="/sml/value1" SemanticItem="siOne" runat="server"></speech:answer>
    <speech:answer id="answer2" XpathTrigger="/sml/value2" SemanticItem="siTwo" runat="server"></speech:answer>
    <speech:answer id="answer3" XpathTrigger="/sml/value3" SemanticItem="siThree" runat="server"></speech:answer>
    <speech:answer id="answer4" XpathTrigger="/sml/value4" SemanticItem="siFour" runat="server"></speech:answer>
    <speech:answer id="answer5" XpathTrigger="/sml/value5" SemanticItem="siFive" runat="server"></speech:answer>
  </Answers>
</speech:QA>

[0942] 2.4 Voice Only App, Statement
<speech:QA id="welcome" PlayOnce="true" runat="server">
  <Prompt InLinePrompt="Hello there!"></Prompt>
</speech:QA>

[0943] 2.5 Voice Only App, Simple Question
<speech:QA id="qa1" runat="server">
  <Reco id="reco1" mode="automatic">
    <Grammars>
      <speech:grammar src="http://mysite/citygrammar.grxml" runat="server"></speech:grammar>
    </Grammars>
  </Reco>
  <Prompt InLinePrompt="Which city do you want to fly to?"></Prompt>
  <Answers>
    <speech:answer id="answer1" XpathTrigger="/sml/city" SemanticItem="siCity" runat="server"></speech:answer>
  </Answers>
</speech:QA>

[0944] 2.6 Voice Only App, Question With Mixed-Initiative (Optional Answers)
<speech:QA id="qa1" runat="server">
  <Reco id="reco1" mode="automatic">
    <Grammars>
      <speech:grammar src="http://mysite/cityANDstate.xml" runat="server"></speech:grammar>
    </Grammars>
  </Reco>
  <Prompt InLinePrompt="Which city do you want to fly to?"></Prompt>
  <Answers>
    <speech:answer id="answer1" XpathTrigger="/sml/city" SemanticItem="siCity" runat="server"></speech:answer>
  </Answers>
  <ExtraAnswers>
    <speech:answer id="answer2" XpathTrigger="/sml/state" SemanticItem="siState" runat="server"></speech:answer>
  </ExtraAnswers>
</speech:QA>

[0945] 2.7 Voice Only App, Explicit Confirmation

<speech:QA id=“qa1” runat=“server”>
  <Reco id=“reco1” mode=“automatic”>
    <Grammars>
      <speech:grammar src=“http://mysite/citygrammar.xml” runat=“server”></speech:grammar>
    </Grammars>
  </Reco>
  <Prompt InLinePrompt=“Which city do you want to fly to?”></Prompt>
  <Answers>
    <speech:answer id=“answer1” XpathTrigger=“/sml/city” SemanticItem=“siCity” confirmThreshold=“0.75” runat=“server”></speech:answer>
  </Answers>
</speech:QA>
<speech:QA id=“qa2” runat=“server” xpathAcceptConfirms=“/sml/accept” xpathDenyConfirms=“/sml/deny”>
  <Prompt InLinePrompt=“Did you say <SALT:value>textbox1.value</SALT:value>”></Prompt>
  <Reco id=“reco1” mode=“automatic”>
    <Grammars>
      <speech:grammar src=“http://mysite/yes_no_city.xml” runat=“server”></speech:grammar>
    </Grammars>
  </Reco>
  <Confirms>
    <speech:answer id=“answer2” XpathTrigger=“/sml/city” SemanticItem=“siCity” confirmThreshold=“0.75” runat=“server”></speech:answer>
  </Confirms>
</speech:QA>

[0946] 2.8 Voice Only App, Short Time-Out Confirmation

<speech:QA id=“qa1” runat=“server” xpathAcceptConfirms=“/sml/accept” xpathDenyConfirms=“/sml/deny” firstInitialTimeout=“500”>
  <Prompt InLinePrompt=“Did you say <SALT:value>textbox1.value</SALT:value>”></Prompt>
  <Reco id=“reco1” InitialTimeout=“350” mode=“automatic”>
    <Grammars>
      <speech:grammar src=“http://mysite/yes_no_city.grxml” runat=“server”></speech:grammar>
    </Grammars>
  </Reco>
  <Confirms>
    <speech:answer XpathTrigger=“/sml/city” SemanticItem=“siCity” confirmThreshold=“0.75” runat=“server”></speech:answer>
  </Confirms>
</speech:QA>

[0947] 2.9 Voice Only App, Commands

<speech:QA id=“qa1” runat=“server”>
  <Prompt id=“prompt1” InLinePrompt=“Where do you want to fly to?”></Prompt>
  <Reco id=“reco1” mode=“automatic”>
    <Grammars>
      <speech:grammar src=“http://mysite/city.grxml” runat=“server”></speech:grammar>
    </Grammars>
  </Reco>
  <Answers>
    <speech:answer id=“answer1” XpathTrigger=“/sml/city” SemanticItem=“siCity” runat=“server”></speech:answer>
  </Answers>
</speech:QA>
<speech:Command id=“command1” type=“cancel” scope=“qa1” OnClientCommand=“myCommand” runat=“server”></speech:Command>
<script>
function myCommand( ) {
  CallControl.Hangup( );
}
</script>

[0948] 2.10 Voice Only App, Prompt Selection

<speech:qa id=“qa1” runat=“server”>
  <Prompt id=“prompt1” PromptSelectFunction=“promptSelection”/>
  <Reco id=“reco1” mode=“automatic”>
    <Grammars>
      <speech:grammar src=“http://mysite/city.xml” runat=“server”></speech:grammar>
    </Grammars>
  </Reco>
  <Answers>
    <speech:answer id=“answer1” XpathTrigger=“/sml/city” SemanticItem=“siCity” runat=“server”></speech:answer>
  </Answers>
</speech:qa>
<script>
function promptSelection(lastCommandOrException, count, answerArray) {
  if (lastCommandOrException == “Silence”) {
    return “Sorry, I couldn't hear you. Please speak louder. Where do you want to fly to?”;
  } else if (count > 3) {
    return “Communication problems are preventing me from hearing the arrival city. Please try again later.”;
  }
  return “Where do you want to fly to?”; // Default prompt
}
</script>

[0949] 2.11 Voice Only App, Implicit Confirmation

<speech:qa id=“qa1” runat=“server” xpathDenyConfirms=“/sml/deny” xpathAcceptConfirms=“/sml/accept”>
  <Prompt id=“prompt1” PromptSelectFunction=“promptSelection”></Prompt>
  <Reco id=“reco1” mode=“automatic”>
    <Grammars>
      <speech:grammar src=“http://mysite/yes_no_city.xml” runat=“server”></speech:grammar>
    </Grammars>
  </Reco>
  <Answers>
    <speech:answer id=“answer1” XpathTrigger=“/sml/date” SemItem=“siDate” runat=“server”></speech:answer>
  </Answers>
  <Confirms>
    <speech:answer id=“confirm1” XpathTrigger=“/sml/city” SemItem=“siCity” runat=“server”></speech:answer>
  </Confirms>
</speech:qa>
<script>
function promptSelection(lastCommandOrException, count, SemanticItemList) {
  var myPrompt = “ ”;
  if (SemanticItemList[“siCity”].value != null) {
    myPrompt = “Flying from ” + SemanticItemList[“siCity”].value + “. ”;
    myPrompt += “On what date?”;
  } else {
    myPrompt = “On what date?”;
  }
  return myPrompt;
}
</script>

[0950] 2.12 Voice Only App, QA with Reco and Dtmf

<speech:qa id=“qa1” runat=“server”>
  <Prompt id=“prompt1” InLinePrompt=“Press or say one if you accept the charges, two if you don't.”></Prompt>
  <Reco id=“reco1” mode=“automatic”>
    <Grammars>
      <speech:grammar src=“http://mysite/acceptCharges.xml” runat=“server”></speech:grammar>
    </Grammars>
  </Reco>
  <Dtmf smlContext=“sml/accept”></Dtmf>
  <Answers>
    <speech:answer id=“answer1” XpathTrigger=“/sml/accept” SemanticItem=“siAccept” runat=“server”></speech:answer>
  </Answers>
</speech:qa>

[0951] 2.13 Voice-Only App, Record-Only QA

<speech:qa id=“qa1” runat=“server”>
  <Answers>
    <speech:answer id=“a1” XpathTrigger=“/SML/@recordlocation” SemanticItem=“foo” runat=“server”></speech:answer>
  </Answers>
  <Reco id=“recordonly”>
    <record beep=“true”></record>
  </Reco>
</speech:qa>
</FORM>

[0952] 3 Design Details

[0953] 3.1 QA Activation (Voice-Only)

[0954] QAs are tested for activeness in SpeechIndex order (see run-time behavior).

[0955] A QA is active when clientActivationFunction returns true AND

[0956] If the Answers array is non-empty, the SemanticItems pointed to by the set of Answers are empty OR

[0957] If the Answers array is empty, at least one item in the Confirms array needs confirmation.

[0958] A QA can have only Answers (normal question: Where do you want to go?), only Confirms (explicit confirmation: Did you say Boston? or short time-out confirmation: Boston.), both (implicit confirmation: When do you want to fly to Boston?) or none (statement: Welcome to my application!).

[0959] A QA can have extra answers even if it has no answers (e.g., mixed initiative).
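The activation rules above can be sketched in client-side script. This is a minimal illustration, not the RunSpeech source: the function name isQAActive, the object shapes and the state strings are assumptions made for the example. Two points the text leaves open are also assumed here: "are empty" is read as "at least one referenced semantic item is still empty", and a statement QA with neither Answers nor Confirms is treated as active.

```javascript
// Hypothetical shapes (assumptions for illustration only):
//   qa            = { clientActivationFunction, answers: [semanticItemId], confirms: [semanticItemId] }
//   semanticItems = { id: { state: "empty" | "needsConfirmation" | "confirmed" } }
function isQAActive(qa, semanticItems, lastActiveObj, lastCommandOrException, count) {
  // The author-supplied activation function must return true (it defaults to true when absent).
  var activation = qa.clientActivationFunction;
  if (activation && !activation(lastActiveObj, lastCommandOrException, count)) {
    return false;
  }
  var answers = qa.answers || [];
  var confirms = qa.confirms || [];
  if (answers.length > 0) {
    // Answers present: active while a referenced semantic item is still empty
    // (assumed reading of "are empty").
    return answers.some(function (id) {
      return semanticItems[id].state === "empty";
    });
  }
  if (confirms.length > 0) {
    // No Answers: active if at least one Confirm item needs confirmation.
    return confirms.some(function (id) {
      return semanticItems[id].state === "needsConfirmation";
    });
  }
  // Assumption: a statement (no Answers, no Confirms) is active.
  return true;
}
```

In this sketch an implicit-confirmation QA (both Answers and Confirms) follows the Answers rule, as the text states.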

[0960] 3.2 Answer, Confirm.

[0961] Upon recognition, commands are processed first, followed by Answers, ExtraAnswers and Confirms.

[0962] A target element (e.g., textbox1.value) can be in one of these states: empty, invalid, needsConfirmation, confirmed. A target is empty before any recognition result is associated with this item, or if the item has been cleared. A target is in the needsConfirmation state when a recognition result has been associated with it, but the confidence level is below the confirmThreshold for this item. A target is confirmed when either a recognition result has been associated with it with a high enough confidence level or a confirmation loop has set it to this state explicitly.

[0963] Answers are therefore responsible for setting the value in the target element and the confidence level (this is done in a semantic layer). Confirms are responsible for confirming the item, clearing it or setting it to a new value (with a new confidence level).
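The state transitions described above can be sketched as two small functions. The names applyAnswer and applyConfirm, the item shape and the "accept"/"deny" outcome strings are hypothetical. The sketch assumes that a confidence equal to the threshold counts as "high enough", and it does not model the invalid state, which the text lists without describing.

```javascript
// Hypothetical item shape: { value, confidence, state }.
// An Answer sets the value and confidence and derives the resulting state.
function applyAnswer(item, value, confidence, confirmThreshold) {
  item.value = value;
  item.confidence = confidence;
  // Assumption: confidence >= threshold is treated as high enough.
  item.state = confidence >= confirmThreshold ? "confirmed" : "needsConfirmation";
  return item;
}

// A Confirm either confirms the item or clears it.
// Setting a new value would go through applyAnswer again.
function applyConfirm(item, outcome) {
  if (outcome === "accept") {
    item.state = "confirmed";
  } else if (outcome === "deny") {
    item.value = null;
    item.confidence = 0;
    item.state = "empty";
  }
  return item;
}
```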

[0964] 3.3 Command Execution (and Scope)

[0965] Commands specify a scope and are active for all QAs within that scope. The default processing of a command is to set the current QA's lastCommandOrException to the command's type. If the command specifies a Grammar, this grammar is activated in parallel with any grammars in the current Reco object. QAs can be modal (allowCommands=false), in which case no commands will be processed for that particular QA.

[0966] 3.4 Validators

[0967] A CompareValidator will be active when the value of the SemanticItemToValidate it refers to has not been validated by this validator. If SemanticItemToCompare is specified (rather than ValueToCompare), then the CompareValidator will only be active if the value of the SemanticItemToCompare is non-empty (i.e., if it has been assigned a value by a previous QA).

[0968] A CustomValidator will be active when the value of the SemanticItemToValidate it refers to has not been validated by this validator.

[0969] 4 Run Time Behavior

[0970] 4.1 Client Detection

[0971] The speech controls take into account the type of client they are rendering for. If the client doesn't support SALT, the controls won't render any speech-related tags or script. Client detection is done by checking the browser capabilities and detecting whether it's a voice-only client (browser is Quadrant) or multimodal (IE, PocketIE, etc., with SALT support).

[0972] Hands-free is not a mode in the client, but rather an application-specific modality, and therefore the only support required is SALT (as in multimodal). Hands-free operation is therefore switched on by application logic.

[0973] 4.2 Multimodal

[0974] Support for multimodal applications is built into the speech controls. In multimodal operation, commands, dtmf, confirms, prompts, etc. do not make sense from an interaction point of view, so they won't be rendered. Tap-and-talk (or any other type of interaction, like click-and-wait-for-recognition) is enabled by hooking up the calls to start and stop recognition with GUI events using the Reco object attributes startElement/startEvent and stopElement/stopEvent, plus the Reco object mode attribute.

[0975] During render time, the speech controls are passed information specifying whether the client is a voice-only client or a multimodal client. If the client is multimodal, the rendering process hooks the call to start recognition to the GUI event specified by the StartEvent attribute of the Reco object. The rendering process also hooks the call to stop recognition to the GUI event specified by the StopEvent attribute of the Reco object.
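As a rough illustration of this hook-up, the script below splits an attribute value of the form used in the Do-Field example (“dofieldButton.onmousedown”) into an element id and an event name and attaches start and stop handlers. The helper name hookRecoEvents and the getElement lookup parameter are hypothetical, and the reco argument is a stand-in for whatever recognition object the rendered page exposes; only its Start and Stop calls are assumed.

```javascript
// Hypothetical helper: wires GUI events to the start and stop of recognition.
// startEvent / stopEvent have the form "elementId.eventName".
// getElement maps an element id to an object (in a browser, e.g. document.getElementById).
function hookRecoEvents(reco, startEvent, stopEvent, getElement) {
  function hook(spec, handler) {
    var dot = spec.lastIndexOf(".");
    var element = getElement(spec.substring(0, dot));
    // e.g. element.onmousedown = handler
    element[spec.substring(dot + 1)] = handler;
  }
  hook(startEvent, function () { reco.Start(); });
  hook(stopEvent, function () { reco.Stop(); });
}
```

With the Do-Field example, pressing the button would start recognition and releasing it would stop recognition.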

[0976] The multimodal client needs a mechanism which will invoke author-specified functions to handle speech-related events (e.g., timeouts) or recognition processing. This mechanism is the Multimodal.js script. Multimodal.js is specified in an external script file and loaded by a single line generated by server-side rendering, e.g., <script language=“javascript” src=“/scripts/Multimodal.js” />

[0977] This method mirrors the ASP.NET way of generating ‘system’ client-side script loaded via URI. Linking to an external script is functionally equivalent to specifying it inline, yet is more efficient, since clients are able to cache the file, and cleaner, since the page is not cluttered with generic functions.

[0978] 4.3 Voice-Only

[0979] 4.3.1 Runtime Script (RunSpeech)

[0980] Unlike in a multimodal interaction, where the user initiates all speech input by clicking/selecting visual elements in the GUI, a mechanism is needed to provide voice-only clients with the information necessary to properly render speech-enabled ASP.NET pages. Such a mechanism must guarantee the execution of dialog logic and maintain the state of user prompting and grammar activation as specified by the author.

[0981] The mechanism used by the Speech Controls is a client-side script (RunSpeech.js) that relies upon the SpeechIndex attribute of the QA control, plus the flow control mechanisms built into the framework (ClientActivationFunction, default activation rules, etc.). RunSpeech is loaded via URI, similar to the loading mechanism of Multimodal.js described above.

[0982] 4.3.2 SpeechIndex

[0983] SpeechIndex is an absolute ordering index within a naming container.

[0984] If more than one speech control has the same SpeechIndex, they are activated in source order. In situations where some controls have SpeechIndex specified and some controls do not, those with SpeechIndex will be activated first, then the rest in source order.

[0985] NOTE: SpeechIndex is automatically set to 0 for new controls. Dialog designers should leave room in their numbering scheme to insert new QAs later. Begin with a midrange integer and increment by 100; for example, number QAs 1000, 1100, 1200 instead of 1, 2, 3. This leaves room for a large number of QAs at any point in the dialog and plenty of room to add QAs at the beginning.
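The ordering rules can be sketched as a sort over the controls of one naming container. The function name activationOrder and the speechIndex property are assumptions for the example; a value of 0 (or a missing value) is treated as "no SpeechIndex specified".

```javascript
// Returns the controls in activation order. The input array is in source order.
// Controls with speechIndex > 0 come first in ascending index order; ties and
// controls without a SpeechIndex keep their source order.
function activationOrder(controls) {
  var entries = controls.map(function (control, sourcePosition) {
    return { control: control, source: sourcePosition, index: control.speechIndex || 0 };
  });
  entries.sort(function (a, b) {
    var aSpecified = a.index > 0;
    var bSpecified = b.index > 0;
    if (aSpecified !== bSpecified) {
      return aSpecified ? -1 : 1;        // specified indexes run first
    }
    if (aSpecified && a.index !== b.index) {
      return a.index - b.index;          // ascending SpeechIndex
    }
    return a.source - b.source;          // source order breaks ties
  });
  return entries.map(function (entry) { return entry.control; });
}
```

The explicit source-position tiebreak means the result does not depend on the sort being stable.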

[0986] 4.3.3 ClientActivationFunction

[0987] clientActivationFunction specifies a client-side script function which returns a Boolean value to determine when this control is considered available for selection by the run-time control selection algorithm. If not specified, it defaults to true (control is active).

[0988] The system strategy can therefore be changed by using this as a condition to activate or de-activate QAs more sensitively than SpeechIndex. If not specified, the QA is considered available for activation.

[0989] 4.3.4 Count

[0990] Count is a property of the QA control that indicates how many times that control has been activated consecutively. This Count property will be reset if the previously active QA is different from the current QA (the same applies for Validators); otherwise, it is incremented by one. The Count property is exposed to application developers through the PromptSelectFunction of the Prompt object.
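The update rule for Count can be written as a one-line function. The name nextCount is hypothetical, and the sketch assumes that "reset" means returning to 1, since Count is described elsewhere in this section as starting at 1.

```javascript
// previousActiveId / currentActiveId identify the previously and currently active control.
// Assumption: a reset sets the count back to 1.
function nextCount(previousActiveId, currentActiveId, previousCount) {
  return previousActiveId === currentActiveId ? previousCount + 1 : 1;
}
```

A PromptSelectFunction can then vary its prompt on the value it receives, as the Prompt Selection example does with count>3.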

[0991] Controls Reference

[0992] General Authoring Notes

[0993] 1. Script References are not Validated at Render Time.

[0994] The Speech Controls and objects described in this section contain attributes whose values are references to script functions written by the dialog author. These functions are executed on client devices in response to speech-related events (e.g., expiration of a timeout) or as run-time processing (e.g., modification of prompt text prior to playback). Render-time validation is not performed on script references, i.e., no check for the existence of script functions is done during rendering of controls. If an attribute contains a reference to a client-side script function and the function does not exist, client-side exceptions will be thrown.

[0995] In voice-only mode, script functions generating exceptions during run time will cause a redirection to the error page defined in the Web.config file. If no error page is defined, RunSpeech will continue to execute without reporting the exception.

[0996] 2. All Speech Controls Should be Contained Within ASP.NET <form> Tag or Equivalent.

[0997] The Speech Controls described in this section must all be placed in ASP.NET web pages inside the <form> tag. Behavior of controls placed outside the <form> tag is undefined.

[0998] 3. Client-Side Script References Must Refer to Function and not Include Parentheses.

[0999] Using the PromptSelectFunction as an example, the following is correct syntax:

[1000] <Prompt id=“P1” PromptSelectFunction=“mySelectFunction”/> // using “mySelectFunction( )” is incorrect syntax

[1001] 4. IE Requires Exact Cases when Running Jscript.

[1002] Therefore, the case for event values specified in the StartEvent and StopEvent attributes of the Reco object must be exactly as those events are defined. This happens to be all lowercase letters for most standard IE events. For example, the onmouseup and onmousedown events must be specified in all lowercase letters.

[1003] 5. All Speech Controls Expose the Common Attribute id.

[1004] 6. Behavior of Visible and Enabled Properties of Speech Controls.

[1005] Setting the visible or enabled properties of Speech Controls to “False” will cause them not to render.

[1006] 7. Minimum Client Requirements

[1007] In one embodiment, clients must be running IE6.0 or greater and JScript 5.5 or greater for speech controls and associated script functions to work properly.

[1008] 8. Rendering <smex> to Telserver

[1009] The speech controls automatically handle rendering <smex> tags to the telephony server on every page, as is required by the server. In one embodiment, smex tags are rendered whether the client is the tel server or the desktop client.

[1010] 5 Global Application Settings

[1011] Speech Controls provide mechanisms that allow dialog authors to specify values for control properties on an application or page basis.

[1012] 5.1 Application-Level Settings

[1013] 5.1.1 Application Global Variables

[1014] Dialog authors may use their application's Web.config file to set values of global variables for speech-enabled web applications. The values of the global variables persist throughout the entire lifetime of the web application. ‘Errorpage’ is the only global variable that may be specified and is set for the application during render time.

<appSettings>
  <add key=“errorpage” value=“...” />
</appSettings>

[1015] The <appSettings> tag must be placed one level inside the <configuration> tag within the Web.config file.

[1016] The errorpage key specifies a URI to a default error page. Redirection to this error page will occur during run time when the speech platform or the DTMF engine returns an error. A default error page is included with the SDK; the user can also create a custom error page.

[1017] Note: Developers who create their own error page must call window.close at the bottom of the error page in the voice-only case in order to release the call.

[1018] 5.1.2 Application-Level Setting of Common Control Properties

[1019] Dialog authors may use their application's Web.config file to set values of common control properties and have those values persist during the lifetime of the web application. For example, an author may wish to use the Web.config file to set the maxTimeout value for Reco objects in their application. The properties are set in the Web.config file using the following syntax:

<configuration>
  <SpeechStyleSheet>
    <Style id=“style1” >
      <QA allowCommands=“false” > ...
        <Prompt bargein=“false”... />
        <Reco maxTimeout=“5000”... />
        <Dtmf preFlush=“true” ... />
        <Answers confirmThreshold=“0.80” ... />
        <ExtraAnswers confirmThreshold=“0.80” .../>
        <Confirms confirmThreshold=“0.80”... />
      </QA>
      <Command .../>
      <CustomValidator .../>
      <CompareValidator .../>
      <SemanticItem .../>
    </Style>
  </SpeechStyleSheet>
</configuration>

[1020] The corresponding Reco object would reference the “style1” Style:

[1021] <Reco id=“reco1” . . . StyleReference=“style1” . . . />

[1022] If the Style id is “globalstyle,” the property values set in the Style apply application-wide to pertinent controls. So, in the above example, if id=“” (or the property is omitted from the Style tag), a maxTimeout of 5000 milliseconds will be used for all Reco objects in the application (unless overridden).

[1023] For a complete list of properties which are settable through the SpeechStyleSheet, see below.

[1024] 6 StyleSheet Control

[1025] The StyleSheet control allows dialog authors to set values for common control properties at a page-level scope. The StyleSheet control is a collection of Style objects. The Style object exposes properties of each control that are settable on a page-level basis. The StyleSheet control is rendered for both multimodal and voice-only modes. An exception will be thrown if the StyleSheet control contains an object which is not a Style object.

class StyleSheet : SpeechControl {
  string id{get; set;};
  StyleCollection Styles{get;};
}

[1026] 6.1 StyleSheet Properties

[1027] Styles

[1028] Optional. Used in both multimodal and voice-only modes. The Styles property is a collection of Style objects used to set property values for Speech Controls and their objects. The property values last during the lifetime of the current page.

[1029] 7 Style Object

[1030] The Style object is used to set property values for Speech Controls and their objects. The property values last during the lifetime of the current page.

class Style : Control {
  string id{get; set;};
  string StyleReference{get; set;};
  QAStyle QA{get; set;};
  CommandStyle Command{get; set;};
  CustomValidatorStyle CustomValidator{get; set;};
  CompareValidatorStyle CompareValidator{get; set;};
  SemanticItemStyle SemanticItem{get; set;};
}

[1031] 7.1 Style Properties

[1032] id

[1033] Required. The programmatic name of the Style object.

[1034] StyleReference

[1035] Optional. Used in both multimodal and voice-only modes. Specifies the name of a Style object. At render time, the StyleSheet control will search for the named Style object and also set property values specified in the named Style. An exception is thrown for an invalid StyleReference.

[1036] For every property of a speech control with a StyleReference, the value is taken from the first of the following sources that specifies it:

[1037] 1. the value set directly in the speech control

[1038] 2. the style object directly referenced

[1039] 3. any style referenced by a style

[1040] 4. the global style object

[1041] 5. the speech control default value.

[1042] The following example shows how two QA properties are set using StyleReference:

<speech:StyleSheet id=“SS”>
  <speech:Style id=“base_style” >
    <QA OnClientActive=“myOnClientActive”/>
  </speech:Style>
  <speech:Style id=“derived_style” StyleReference=“base_style”>
    <QA PlayOnce=“true”/>
  </speech:Style>
</speech:StyleSheet>
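The lookup order listed above can be sketched as a walk along the chain of referenced styles. Everything in the sketch is illustrative: the function name resolveProperty, the flattened properties maps, and the use of the key “globalstyle” for the global style object are assumptions, not the rendering code of the controls.

```javascript
// control        = { properties: {...}, styleReference: "styleId" }
// styles         = { styleId: { properties: {...}, styleReference: "otherStyleId" } }
// controlDefaults = { propertyName: defaultValue }
function resolveProperty(name, control, styles, controlDefaults) {
  // 1. Value set directly on the speech control.
  if (control.properties[name] !== undefined) {
    return control.properties[name];
  }
  // 2. and 3. The directly referenced style, then any style it references in turn.
  var visited = {};
  var styleId = control.styleReference;
  while (styleId && !visited[styleId]) {
    visited[styleId] = true;               // guard against reference cycles
    var style = styles[styleId];
    if (!style) {
      throw new Error("Invalid StyleReference: " + styleId);
    }
    if (style.properties[name] !== undefined) {
      return style.properties[name];
    }
    styleId = style.styleReference;
  }
  // 4. The global style object (assumed here to be stored under "globalstyle").
  var globalStyle = styles["globalstyle"];
  if (globalStyle && globalStyle.properties[name] !== undefined) {
    return globalStyle.properties[name];
  }
  // 5. The speech control default value.
  return controlDefaults[name];
}
```

Applied to the example above, a QA that references derived_style would obtain PlayOnce from derived_style and OnClientActive from base_style.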

[1043] QA

[1044] Optional. The QA property of the Style object is used to set property values for all QA controls on a page that reference this Style. The following example shows how to set the AllowCommands and PlayOnce properties for the QA controls that reference this Style:

<speech:StyleSheet id=“SS1”>
  <speech:Style id=“WelcomePageQA_Style” >
    <QA AllowCommands=“false” PlayOnce=“true”/>
  </speech:Style>
</speech:StyleSheet>
<QA id=“...” StyleReference=“WelcomePageQA_Style” .../>

The next example shows how to set the bargein property of all Prompt objects on a given page using Params:

<speech:StyleSheet id=“SS2”>
  <Style Name=“Style1”>
    <QA>
      <Answers ConfirmThreshold=“0.8” Reject=“0.4”/>
      <Prompt>
        <Params>
          <Param name=“BargeinType” value=“grammar”/>
          <Param name=“foo” value=“bar” />
        </Params>
      </Prompt>
    </QA>
  </Style>
</speech:StyleSheet>

[1045] Command

[1046] Optional. The Command property of the Style object is used to set property values for all Command controls on a page that reference this Style.

[1047] CustomValidator

[1048] Optional. The CustomValidator property of the Style object is used to set property values for all CustomValidator controls on a page that reference this Style.

[1049] CompareValidator

[1050] Optional. The CompareValidator property of the Style object is used to set property values for all CompareValidator controls on a page that reference this Style.

[1051] SemanticItem

[1052] Optional. The SemanticItem property of the Style object is used to set property values for all SemanticItem controls on a page that reference this Style. The following properties may be set using the Style object.

[1053] QA Properties

[1054] AllowCommands

[1055] PlayOnce

[1056] XpathAcceptConfirms

[1057] XpathDenyConfirms

[1058] AcceptRejectThreshold

[1059] DenyRejectThreshold

[1060] FirstInitialTimeout

[1061] ConfirmByOmission

[1062] ConfirmIfEqual

[1063] OnClientActive

[1064] OnClientListening

[1065] OnClientComplete.

[1066] Prompt Properties

[1067] These apply to Prompts in QA, CompareValidator, CustomValidator and Command controls.

[1068] Bargein

[1069] OnClientBookmark

[1070] OnClientError

[1071] Prefetch

[1072] Type

[1073] Lang

[1074] Params

[1075] Reco Properties

[1076] StartEvent

[1077] StopEvent

[1078] Mode

[1079] InitialTimeout

[1080] BabbleTimeout

[1081] MaxTimeout

[1082] EndSilence

[1083] Reject

[1084] OnClientSpeechDetected

[1085] OnClientSilence

[1086] OnClientNoReco

[1087] OnClientError

[1088] Lang

[1089] Params

[1090] Grammar Properties

[1091] These apply to both Reco and Dtmf grammars.

[1092] Type

[1093] Lang

[1094] Dtmf Properties

[1095] InitialTimeout

[1096] InterDigitTimeout

[1097] OnClientSilence

[1098] OnClientKeyPress

[1099] OnClientError

[1100] Params

[1101] Answer Properties

[1102] These apply to the Answers, ExtraAnswers and Confirms collections.

[1103] ConfirmThreshold

[1104] Reject

[1105] Command Properties

[1106] Scope

[1107] AcceptCommandThreshold

[1108] CompareValidator Properties

[1109] ValidationEvent

[1110] Operator

[1111] Type

[1112] InvalidateBoth

[1113] CustomValidator Properties

[1114] ValidationEvent

[1115] SemanticItem Properties

[1116] BindOnChange

[1117] 8 QA control

[1118] The QA control is responsible for querying the user with a prompt, starting a corresponding recognition object and processing recognition results.

[1119] The QA control is rendered for both multimodal and voice-only modes.

class QA : IndexedStyleReferenceSpeechControl {
  string id{get; set;};
  int SpeechIndex{get; set;};
  string ClientActivationFunction{get; set;};
  string OnClientActive{get; set;};
  string OnClientComplete{get; set;};
  string OnClientListening{get; set;};
  bool AllowCommands{get; set;};
  bool PlayOnce{get; set;};
  string XpathAcceptConfirms{get; set;};
  string XpathDenyConfirms{get; set;};
  float AcceptRejectThreshold{get; set;};
  float DenyRejectThreshold{get; set;};
  float FirstInitialTimeout{get; set;};
  string StyleReference{get; set;};
  bool ConfirmByOmission{get; set;};
  bool ConfirmIfEqual{get; set;};
  AnswerCollection Answers{get;};
  AnswerCollection ExtraAnswers{get;};
  AnswerCollection Confirms{get;};
  Prompt Prompt{get;};
  Reco Reco{get;};
  Dtmf Dtmf{get;};
}

[1120] 8.1 QA Properties

[1121] All properties of the QA control are available to the application developer at design time.

[1122] SpeechIndex

[1123] Optional. Default is zero, which is equivalent to no SpeechIndex. Only used in voice-only mode. Specifies the activation order of speech controls on a page and the activation order of composite controls. All controls with SpeechIndex>0 will be run, and then controls with SpeechIndex=0 will be run in source order. If more than one control has the same SpeechIndex, they are activated in source order. In situations where some controls specify SpeechIndex and some controls do not, those with SpeechIndex specified will be activated first, then the rest in source order. SpeechIndex values start at 1. An exception will be thrown for invalid values of SpeechIndex.

[1124] ClientActivationFunction

[1125] Optional. Only used in voice-only mode. Specifies a client-side script function which returns a Boolean value to determine when a QA control is considered available for selection by the run-time control selection algorithm. If not specified, it defaults to true (control is active). The signature for ClientActivationFunction is as follows:

[1126] bool ClientActivationFunction (object lastActiveObj, string lastCommandOrException, int count)

[1127] where:

[1128] lastActiveObj is the last active control, e.g., QA, CustomValidator or CompareValidator. For the first activated QA on a page, lastActiveObj will be null.

[1129] lastCommandOrException is a Command type (e.g., “Help”) or a Reco event (e.g., “Silence” or “NoReco”) of the last active control. For the first activated QA on a page or if the last active control is a validator, lastCommandOrException will be an empty string.

[1130] count is the number of times the last active QA has been activated consecutively, 1 if this is the first active QA on the page. Count starts at 1 and has no limit. However, for the first activated QA on a page, count will be set to zero.
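As an illustration of this signature, the author-written function below makes a fallback QA available only after the previously active control has met silence or a failed recognition three times in a row. The function name activateOperatorTransfer and the threshold of three are invented for the example.

```javascript
// Hypothetical author function, referenced from markup as
// ClientActivationFunction="activateOperatorTransfer" (name only, no parentheses).
function activateOperatorTransfer(lastActiveObj, lastCommandOrException, count) {
  if (lastActiveObj === null) {
    return false;                      // first activated QA on the page: nothing has failed yet
  }
  var troubled = lastCommandOrException === "Silence" ||
                 lastCommandOrException === "NoReco";
  return troubled && count >= 3;       // activate only after repeated trouble
}
```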

[1131] OnClientActive

[1132] Optional. Used in both multimodal and voice-only modes. Specifies a client-side script that will be called after RunSpeech determines this QA is active (voice-only mode) or after the startEvent is fired (in multimodal) and before processing the QA (e.g., playing a prompt or starting recognition). The onClientActive function does not return values. The signature for onClientActive is as follows:

[1133] function onClientActive(string eventsource, string lastCommandOrException, int Count, object SemanticItemList)

[1134] where:

[1135] eventsource is the id of the object (specified by Reco.StartEvent) whose event started the Reco associated with the QA (for multimodal). eventsource will be null in voice-only mode.

[1136] lastCommandOrException is a Command type (e.g., “Help”) or a Reco event (e.g., “Silence” or “NoReco”) for voice-only mode. lastCommandOrException is the empty string for multimodal.

[1137] Count is the number of times the QA has been activated consecutively. Count starts at 1 and has no limit for voice-only mode. Count is zero for multimodal. For voice-only mode, SemanticItemList is an associative array that maps semantic item ids to semantic item objects. For multimodal, SemanticItemList will be null.

[1138] OnClientComplete

[1139] Optional. Used in both multimodal and voice-only modes. Specifies a client-side script that will be called after execution of a QA (successfully or not) and before passing dialog control back to the RunSpeech algorithm (in voice-only) or the end user (in multimodal). The OnClientComplete function is called before postbacks to the server for QAs in which the AutoPostBack attribute of the Answer object is set to true. The onClientComplete function does not return values. The signature for onClientComplete is as follows:

[1140] function onClientComplete(string eventsource, string lastCommandOrException, int Count, object SemanticItemList)

[1141] where:

[1142] eventsource is the id of the object (specified by Reco.StopEvent) whose event stopped the Reco associated with the QA (for multimodal). eventsource will be null in voice-only mode.

[1143] lastCommandOrException is a Command type (e.g., “Help”) or a Reco event (e.g., “Silence” or “NoReco”) for voice-only mode. lastCommandOrException is the empty string for multimodal.

[1144] Count is the number of times the QA has been activated consecutively. Count starts at 1 and has no limit for voice-only mode. Count is zero for multimodal. For voice-only mode, SemanticItemList is an associative array that maps semantic item ids to semantic item objects. For multimodal, SemanticItemList will be null.

[1145] OnClientListening

[1146] Optional. Used in both multimodal and voice-only modes. Specifies a client-side script (function) that will be called/executed after successful start of the reco object. The main use is so the GUI can change to show the user that they can start speaking. The function does not return any values. The signature for OnClientListening is as follows:

[1147] function onClientListening(string eventsource, string lastCommandOrException, int Count, object SemanticItemList)

[1148] where:

[1149] eventsource is the id of the object (specified by Reco.StartEvent) whose event started the Reco associated with the QA (for multimodal). eventsource will be null in voice-only mode.

[1150] lastCommandOrException is a Command type (e.g., “Help”) or a Reco event (e.g., “Silence” or “NoReco”) for voice-only mode. lastCommandOrException is the empty string for multimodal.

[1151] Count is the number of times the QA has been activated consecutively. Count starts at 1 and has no limit for voice-only mode. Count is zero for multimodal. For voice-only mode, SemanticItemList is an associative array that maps semantic item ids to semantic item objects. For multimodal, SemanticItemList will be null.

[1152] Note: In multimodal mode, OnClientListening is only available if the author chooses to use StartEvent. If the author decides to start reco programmatically, then OnClientListening is not called, because the author can detect when reco.start returns successfully.

[1153] Note: OnClientListening is ignored when specified in QAs that do not contain reco objects.
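A handler of this kind might simply update a status element so the user knows recognition has started. The sketch below builds such a handler around an indicator object; the factory name makeListeningHandler, the indicator object and the message text are all hypothetical, and only the four-parameter signature comes from the description above.

```javascript
// Builds a handler with the OnClientListening signature.
// indicator is any object with an innerText property (in a browser, a status element).
function makeListeningHandler(indicator) {
  return function onClientListening(eventsource, lastCommandOrException, Count, SemanticItemList) {
    // In multimodal mode eventsource is the id of the element that started the Reco;
    // in voice-only mode it is null.
    indicator.innerText = eventsource
      ? "Listening (" + eventsource + ")..."
      : "Listening...";
  };
}
```

On a page, the returned function would be assigned to a named global so that its name can be given as the OnClientListening attribute value.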

[1154] AllowCommands

[1155] Optional. Only used in voice-only mode. Indicates whether or not Commands may be activated for a QA control. When AllowCommands is set to false, no commands may be activated. Defaults to true.

[1156] PlayOnce

[1157] Optional. Only used in voice-only mode. Specifies whether or not a QA may be activated more than once per page. If not specified, PlayOnce is set to false. PlayOnce=“true” may be used to author statements like welcoming prompts. When a QA is reduced to a statement (no reco), setting PlayOnce=“false” gives dialog authors the capability to enable a “repeat” functionality on a page that reads email messages.

[1158] XpathAcceptConfirms

[1159] Optional. Only used in voice-only mode. Specifies the path in thesml document (recognition result) that indicates the confirm items wereaccepted. Required if Confirms are specified. If XpathAcceptConfirms isspecified without a Confirm being specified it is ignored.XpathAcceptConfirms must be a valid xml path. An invalid xml path willcause a redirection to the default error page during run time.

[1160] XpathDenyConfirms

[1161] Optional. Used only in voice-only mode. Specifies the path in the SML document that indicates the confirm items were denied. Required if Confirms are specified. If a Confirm is specified and XpathDenyConfirms is not set, an exception is thrown. If XpathDenyConfirms is specified without a Confirm being specified, it is ignored. XpathDenyConfirms must be a valid XML path. An invalid XML path will cause a redirection to the default error page at run time.

[1162] AcceptRejectThreshold

[1163] Optional. Used only in voice-only mode. If the confidence for an accept confirm is not above this threshold, no action will be taken. Legal values are 0-1 and are platform specific. An exception will be thrown for out-of-range AcceptRejectThreshold values. Default is zero.

[1164] DenyRejectThreshold

[1165] Optional. Used only in voice-only mode. If the confidence for a deny confirm is not above this threshold, no action will be taken. Legal values are 0-1 and are platform specific. An exception will be thrown for out-of-range DenyRejectThreshold values. Default is zero.
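The way the two confirm paths and their thresholds interact can be sketched in client-side script. This is a minimal illustration under stated assumptions, not the RunSpeech implementation: the recoResult shape (a map from path to an object carrying a confidence), the lowerCamelCase property names and the returned strings are all hypothetical.

```javascript
// Illustrative sketch only: the recoResult shape (path -> { confidence }) and
// the lowerCamelCase property names are assumptions, not the RunSpeech API.
function confirmOutcome(recoResult, qa) {
  for (const t of [qa.acceptRejectThreshold, qa.denyRejectThreshold]) {
    if (t < 0 || t > 1) throw new RangeError("threshold must be within 0-1");
  }
  const accept = recoResult[qa.xpathAcceptConfirms];
  const deny = recoResult[qa.xpathDenyConfirms];
  // A node only counts when its confidence is strictly above the threshold.
  if (accept && accept.confidence > qa.acceptRejectThreshold) return "accepted";
  if (deny && deny.confidence > qa.denyRejectThreshold) return "denied";
  return "no action";
}

const qa = {
  xpathAcceptConfirms: "/sml/yes",
  xpathDenyConfirms: "/sml/no",
  acceptRejectThreshold: 0.6,
  denyRejectThreshold: 0.4,
};
const weakYes = confirmOutcome({ "/sml/yes": { confidence: 0.5 } }, qa);
const clearNo = confirmOutcome({ "/sml/no": { confidence: 0.7 } }, qa);
```

Here a “yes” recognized with confidence 0.5 does not clear the 0.6 accept threshold, so nothing happens, while a “no” at 0.7 clears the 0.4 deny threshold.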

[1166] FirstInitialTimeout

[1167] Optional. Only used in voice-only mode. Specifies the initial timeout in milliseconds for the QA when Count is 1. The status of the TargetElements specified in the Confirms answer list will be set to “Confirmed” if no speech is detected within FirstInitialTimeout milliseconds. If not specified, the default value of FirstInitialTimeout is 0, which means that silence does not imply confirmation of the Answer. An exception will be thrown if FirstInitialTimeout is specified for a QA that does not contain Confirms. An exception will be thrown for negative values of FirstInitialTimeout.

[1168] StyleReference

[1169] Optional. Used in both multimodal and voice-only modes. Specifies the name of a Style object. At render time, the QA control will search for the named Style control and will use any property values specified on the Style as default values for its own properties. Explicitly set property values on the control will override those set on the Style.

[1170] ConfirmByOmission

[1171] Optional. Used only in voice-only mode. Default is true. This flag controls confirmation of more than one item. If the flag is set to true, then any semantic items whose XPath is not present in the reco result will be set to Confirmed. ConfirmByOmission enables the following scenario:

[1172] (ConfirmByOmission=true)

[1173] Q: Flying from?

[1174] A: Boston.

[1175] Q: Flying to?

[1176] A: Seattle.

[1177] Q: From Boston to Seattle?

[1178] A: From NY.

[1179] (Seattle is confirmed as destination city).
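The dialog above can be traced with a short script. This is a sketch under assumptions: the item shape, the paths, and the state name “NeedsConfirmation” are hypothetical stand-ins; only the “Confirmed” state and the omission rule come from the description.

```javascript
// Hypothetical data shapes for illustration only; not the RunSpeech implementation.
// Each item under confirmation has an xpath, a value and a state.
function applyConfirmTurn(items, recoResult, confirmByOmission) {
  for (const item of items) {
    if (Object.prototype.hasOwnProperty.call(recoResult, item.xpath)) {
      // The user corrected this item: store the new value, still unconfirmed.
      item.value = recoResult[item.xpath];
      item.state = "NeedsConfirmation";
    } else if (confirmByOmission) {
      // The user did not mention this item, so it is treated as accepted.
      item.state = "Confirmed";
    }
  }
  return items;
}

const items = [
  { xpath: "/sml/from", value: "Boston", state: "NeedsConfirmation" },
  { xpath: "/sml/to", value: "Seattle", state: "NeedsConfirmation" },
];
// "From NY." corrects the origin and omits the destination.
applyConfirmTurn(items, { "/sml/from": "NY" }, true);
```

After the call, the origin holds “NY” and still awaits confirmation, while the destination, which was not mentioned, is Confirmed.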

[1180] ConfirmIfEqual

[1181] Optional. Used only in voice-only mode. Default is true. This flag controls the processing of corrections during confirmation. If ConfirmIfEqual is true and a recognized correction is the same value already in the semantic item, the item is marked confirmed. If ConfirmIfEqual is false and a recognized correction is the same value already in the semantic item, the item is marked as needing confirmation.
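A minimal sketch of this rule follows. The item shape and the state name “NeedsConfirmation” are assumptions made for illustration, and the handling of a correction that differs from the stored value is likewise assumed rather than taken from the description.

```javascript
// Hypothetical sketch: the item shape and "NeedsConfirmation" are assumptions.
function applyCorrection(item, correctedValue, confirmIfEqual) {
  if (correctedValue === item.value) {
    // The "correction" repeats the value already stored in the semantic item.
    item.state = confirmIfEqual ? "Confirmed" : "NeedsConfirmation";
  } else {
    // Assumed behavior: a genuinely new value is stored and must be confirmed.
    item.value = correctedValue;
    item.state = "NeedsConfirmation";
  }
  return item;
}

const repeated = applyCorrection({ value: "Boston", state: "NeedsConfirmation" }, "Boston", true);
const repeatedStrict = applyCorrection({ value: "Boston", state: "NeedsConfirmation" }, "Boston", false);
```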

[1182] Answers

[1183] Optional. An array of answer objects. This list of objects is used both to determine activation and to carry out semantic processing logic. An exception will be thrown if an Answers collection contains non-answer objects.

[1184] ExtraAnswers

[1185] Optional. An array of answer objects. These items are not used for activation, but they are taken into account when processing recognition results. If an ExtraAnswer is recognized, it will overwrite the semantic item it points to, even if it was previously confirmed.

[1186] Confirms

[1187] Optional. An array of answer objects. These items are used for activation if the Answers array is empty, and they affect the confirmation logic.

[1188] Prompt

[1189] Optional for multimodal. Required for voice-only. An exception is thrown if a Prompt is not specified in voice-only mode.

[1190] Reco

[1191] Optional for multimodal and voice-only. Typically, only one Reco can be specified in a QA.

[1192] Dtmf

[1193] Optional. Only used in voice-only mode. Typically, only one Dtmf can be specified in a QA.

[1194] 9 Command Control

[1195] The Command control provides a way of obtaining user input that is not an answer to the question at hand (e.g., Help, Repeat, Cancel), and which does not map to textual input into primary controls. A Command specifies an activation scope, which means that its grammar is active (in parallel with the current recognition grammar) for every QA within that scope. Commands have a Type attribute, which is used to implement a chain of events: Commands of the same type at QAs lower in scope can override superior commands with context-sensitive behavior (and even different or extended grammars if necessary). The type is also used to notify the QA which command was uttered (via the reason parameter).

[1196] Commands are not rendered for multimodal mode.

class Command : SpeechControl
{
    string id{get; set;};
    string Scope{get; set;};
    string Type{get; set;};
    string XpathTrigger{get; set;};
    float AcceptCommandThreshold{get; set;};
    string OnClientCommand{get; set;};
    bool AutoPostBack{get; set;};
    TriggeredEventHandler OnTriggered;
    string StyleReference{get; set;};
    Prompt Prompt{get;};
    Grammar Grammar{get;};
    Grammar DtmfGrammar{get;};
}

[1197] 9.1 Command Properties

[1198] All properties of the Command control are available to the application developer at design time.

[1199] Scope

[1200] Required. Only used in voice-only mode. Specifies the id of a QA or other ASP.NET control (e.g., form, panel, or table). Scope is used in Commands to specify when the Command's grammars will be active. Exceptions are thrown if Scope is invalid or not specified.

[1201] Type

[1202] Required. Only used in voice-only mode. Specifies the type of command (e.g., ‘help’, ‘cancel’) in order to allow the overriding of identically typed commands at lower levels of the scope tree. Any string value is possible in this attribute, so it is up to the author to ensure that types are used correctly. An exception is thrown if Type is not specified.

[1203] Note: An exception will be thrown if more than one Command of the same Type has the same Scope, for example, two Type=“Help” Commands for the same QA (Scope=“QA1”).

[1204] AcceptCommandThreshold

[1205] Optional. Only used in voice-only mode. Specifies the minimum confidence level of recognition that is necessary to trigger the command (this is likely to be used when higher than usual confidence is required, e.g., before executing the result of a ‘Cancel’ command). Legal values are 0-1. Default value is 0. Exceptions will be thrown for out-of-range AcceptCommandThreshold values.

[1206] If a command is matched (its XpathTrigger is present in the recoResult), no further commands will be processed, and no Answers, ExtraAnswers, Confirms, etc. will be processed. Then, if the confidence of the node specified by XpathTrigger is greater than or equal to AcceptCommandThreshold, the active QA's LastCommandOrException is set to the Command's Type, and the Command's OnClientCommand function is called. Otherwise (if the confidence of the node is less than AcceptCommandThreshold), the active QA's LastCommandOrException is set to “NoReco” and the OnClientNoReco function of the active QA's Reco is called.
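The matching rule just described can be sketched as follows. The shapes are assumptions made for illustration: recoResult is modeled as a map from path to a node carrying a confidence, and the command and QA objects use hypothetical lowerCamelCase members. The sketch is not the RunSpeech script itself.

```javascript
// Hypothetical shapes: recoResult maps a path to { confidence }.
// Returns true when a command was matched, so answer processing is skipped.
function processCommands(commands, recoResult, activeQA) {
  for (const cmd of commands) {
    const node = recoResult[cmd.xpathTrigger];
    if (node === undefined) continue; // this command was not matched
    if (node.confidence >= cmd.acceptCommandThreshold) {
      activeQA.lastCommandOrException = cmd.type;
      if (cmd.onClientCommand) cmd.onClientCommand(node);
    } else {
      // Matched, but too uncertain: treated like a failed recognition.
      activeQA.lastCommandOrException = "NoReco";
      if (activeQA.onClientNoReco) activeQA.onClientNoReco();
    }
    return true; // first match wins; no further commands are processed
  }
  return false; // no command matched; continue with answers and confirms
}

const calls = [];
const helpCommand = {
  type: "Help",
  xpathTrigger: "/sml/help",
  acceptCommandThreshold: 0.5,
  onClientCommand: () => calls.push("help"),
};
const activeQA = { lastCommandOrException: "", onClientNoReco: () => calls.push("noreco") };
const matched = processCommands([helpCommand], { "/sml/help": { confidence: 0.8 } }, activeQA);
```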

[1207] XpathTrigger

[1208] Required. Only used in voice-only mode. Specifies the SML document path that triggers this command. An exception will be thrown if XpathTrigger is not specified. XpathTrigger must be a valid XML path. An invalid XML path will cause a redirection to the default error page at run time.

[1209] OnClientCommand

[1210] Optional. Only used in voice-only mode. Specifies the client-side script function to execute on recognition of the Command's grammar. The function does not return any values. The signature for OnClientCommand is as follows:

function OnClientCommand(XMLNode smlNode)

[1211] where: smlNode is the matched SML node.

[1212] Note: If AutoPostBack is set to true, the OnClientCommand function is executed before posting back to the server. If the author wishes to persist any page state across postback, the OnClientCommand function is a good place to invoke the ClientViewState object of RunSpeech.

[1213] AutoPostBack

[1214] Optional. Only used in voice-only mode. Specifies whether or not the Command control posts back to the server each time a Command grammar is recognized. Default is false. If set to true, the server-side Triggered event is fired.

[1215] The internal state of the voice-only page is maintained automatically during postback. Authors may use the ClientViewState object of RunSpeech to declare and set additional values they wish to persist across postbacks.

[1216] OnTriggered

[1217] Optional. Only used in voice-only mode. Specifies a server-side script function to be executed when the Triggered event is fired (see the AutoPostBack attribute above). This handler must have the following form (in C#; the signature would look slightly different in other languages):

[1218] void myFunction (object sender, TriggeredEventArgs e);

[1219] The handler can be assigned in two different ways, declaratively:

[1220] <speech:Command . . . OnTriggered=“myFunction” . . . />

[1221] or programmatically:

[1222] Command.Triggered += new TriggeredEventHandler(myFunction);

[1223] TriggeredEventHandler is a delegate: it specifies the signature of functions that can handle its associated event type. It looks like this:

[1224] public delegate void TriggeredEventHandler(object sender, TriggeredEventArgs e);

[1225] where:

[1226] TriggeredEventArgs is a class derived from System.EventArgs which contains one public property, string Value.

[1227] An exception will be thrown if AutoPostBack is set to true and no handler is specified for the Triggered event. An exception will be thrown if AutoPostBack is set to false and a handler is specified for the Triggered event.

[1228] StyleReference

[1229] Optional. Only used in voice-only mode. Specifies the name of a Style object. At render time, the Command control will search for the named Style control and will use any property values specified on the Style as default values for its own properties. Explicitly set property values on the control will override those set on the Style.

[1230] Prompt

[1231] Optional. May be used to specify the prompt to be played for global commands.

[1232] Grammar

[1233] Optional. The grammar object which will listen for the command.

[1234] Note: The grammar object is optional because the QA scoped by this command may contain the rule that generates this command's XPath. The author has the flexibility of specifying the rule in the QA control or the Command control.

[1235] DtmfGrammar

[1236] Optional. The DtmfGrammar object which will activate the command. Available at run time.

[1237] Note: The DtmfGrammar object is optional because the QA scoped by this command may contain the rule that generates this command's XPath. The author has the flexibility of specifying the rule in the QA control or the Command control. DtmfGrammars for all Commands along the QA's scope chain will be combined into the Grammars collection for the QA's Dtmf object.

[1238] Speech Controls does not provide a set of common commands (e.g., help, cancel, repeat).

[1239] 10 CompareValidator Control

[1240] This control compares two values by applying the operator and, if the comparison is false, invalidates the item specified by SemanticItemToValidate. Optionally, both items (ToCompare and ToValidate) are invalidated. The CompareValidator is triggered on the client by change or confirm events; however, validation prompts are played in SpeechIndex order.

[1241] The CompareValidator control is rendered for voice-only mode. For multimodal, ASP.NET validator controls may be used.

class CompareValidator : IndexedStyleReferenceSpeechControl
{
    string id{get; set;};
    int SpeechIndex{get; set;};
    ValidationType Type{get; set;};
    string ValidationEvent{get; set;};
    string SemanticItemToCompare{get; set;};
    string ValueToCompare{get; set;};
    string SemanticItemToValidate{get; set;};
    ValidationCompareOperator Operator{get; set;};
    bool InvalidateBoth{get; set;};
    string StyleReference{get; set;};
    Prompt Prompt{get;};
}

[1242] 10.1 CompareValidator Properties

[1243] All properties of the CompareValidator control are only used in voice-only mode and are available to the application developer at design time.

[1244] SpeechIndex

[1245] Optional. Specifies the activation order of CompareValidator controls on a page. If more than one control has the same SpeechIndex, they are activated in source order. In situations where some controls specify SpeechIndex and some controls do not, those with SpeechIndex specified will be activated first, then the rest in source order. SpeechIndex values start at 1. An exception will be thrown for invalid values of SpeechIndex.
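The ordering rule above amounts to a stable sort with a fallback to source position. The following sketch illustrates it; the control objects and the speechIndex member are hypothetical stand-ins, and the array position is taken to be the source order.

```javascript
// Hypothetical control objects: { id, speechIndex? }; array order = source order.
function activationOrder(controls) {
  const tagged = controls.map((control, position) => {
    const index = control.speechIndex;
    // SpeechIndex values start at 1; anything else is rejected.
    if (index !== undefined && (!Number.isInteger(index) || index < 1)) {
      throw new RangeError("invalid SpeechIndex on " + control.id);
    }
    return { control, position };
  });
  tagged.sort((a, b) => {
    const ai = a.control.speechIndex;
    const bi = b.control.speechIndex;
    // Both specified and different: lower index first.
    if (ai !== undefined && bi !== undefined && ai !== bi) return ai - bi;
    // Exactly one specified: the specified one goes first.
    if ((ai === undefined) !== (bi === undefined)) return ai === undefined ? 1 : -1;
    // Same index, or neither specified: source order.
    return a.position - b.position;
  });
  return tagged.map((entry) => entry.control.id);
}

const order = activationOrder([
  { id: "a" },
  { id: "b", speechIndex: 2 },
  { id: "c", speechIndex: 1 },
  { id: "d", speechIndex: 2 },
  { id: "e" },
]);
```

With this input the controls activate as c, b, d, a, e: indexed controls first, ties in source order, then the unindexed controls in source order.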

[1246] Type

[1247] Required. Sets the datatype of the comparison. Legal values are “String”, “Integer”, “Double”, “Date”, and “Currency”. Default value is “String”.

[1248] ValidationEvent

[1249] Default is “onconfirmed”. ValidationEvent may be set to one of two values, either “onchanged” or “onconfirmed”.

[1250] If ValidationEvent is set to “onchanged”, the CompareValidator will be run each time the value of the Text property of the associated SemanticItem changes. The CompareValidator control will be run before the SemanticItem's OnChanged handler is called. The SemanticItem's OnChanged handler will only be called if the CompareValidator does indeed validate the changed data. If the CompareValidator invalidates the data, the State of the SemanticItem is set to Empty and the OnChanged handler is not called.

[1251] If ValidationEvent is set to “onconfirmed”, the CompareValidator will be run each time the State of the associated SemanticItem changes to Confirmed. The CompareValidator control will be run before the SemanticItem's OnConfirmed handler is called. The SemanticItem's OnConfirmed handler will only be called if the CompareValidator does indeed validate the changed data. If the CompareValidator invalidates the data, the State of the SemanticItem is set to Empty and the OnConfirmed handler is not called.

[1252] After processing all SemanticItems involved in a recognition turn, RunSpeech starts again. At that point, the previously failed validators will be active and RunSpeech will select the first QA/Validator that is active in SpeechIndex order. It is the author's responsibility to place the validator controls directly before the QA control that collects the answer for the SemanticItem in order to get the correct behavior.
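The “onchanged” path can be sketched as follows: the validator runs before the change handler, and a failure empties the item and leaves the validator active for the next RunSpeech pass. Everything named here (the item and validator shapes, the isValid and active members, the “NeedsConfirmation” state) is a hypothetical stand-in rather than the actual client script.

```javascript
// Hypothetical sketch of the "onchanged" validation path.
function changeText(item, newText, validators) {
  item.text = newText;
  item.state = "NeedsConfirmation"; // assumed name for "changed, not yet confirmed"
  for (const v of validators) {
    if (v.validationEvent === "onchanged" && !v.isValid(item)) {
      item.state = "Empty"; // invalidated: the change handler is skipped
      v.active = true;      // its prompt is played when reached in SpeechIndex order
      return false;
    }
  }
  if (item.onChanged) item.onChanged(item);
  return true;
}

const log = [];
const arrival = { text: "", state: "Empty", onChanged: (i) => log.push(i.text) };
const notAustin = {
  validationEvent: "onchanged",
  active: false,
  isValid: (i) => i.text !== "Austin",
};
const firstTry = changeText(arrival, "Austin", [notAustin]);
const secondTry = changeText(arrival, "Boston", [notAustin]);
```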

[1253] SemanticItemToCompare

[1254] Optional. Either SemanticItemToCompare or ValueToCompare must be specified. Specifies the Id of the SemanticItem which will be used as the basis for the comparison. Available at design time and run time. An exception will be thrown if neither SemanticItemToCompare nor ValueToCompare is specified.

[1255] ValueToCompare

[1256] Optional. Either SemanticItemToCompare or ValueToCompare must be specified. Specifies the value to be used as the basis for the comparison. The author may wish to specify the value here instead of taking the value from the semantic item. If both ValueToCompare and SemanticItemToCompare are set, SemanticItemToCompare takes precedence. An exception will be thrown if neither SemanticItemToCompare nor ValueToCompare is specified. An exception will be thrown if ValueToCompare cannot be converted to a valid Type.

[1257] SemanticItemToValidate

[1258] Required. Specifies the Id of the SemanticItem that is being validated against either ValueToCompare or SemanticItemToCompare. An exception will be thrown if SemanticItemToValidate is not specified.

[1259] Operator

[1260] Optional. One of “Equal”, “NotEqual”, “GreaterThan”, “GreaterThanEqual”, “LesserThan”, “LesserThanEqual”, “DataTypeCheck”. Default value is “Equal”. The values are compared in the following order: Value to Validate [operator] ValueToCompare.
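The comparison order and the effect of Type can be illustrated with a small script. This is a sketch only: it covers the “String”, “Integer” and “Double” types, treats “DataTypeCheck” as a test of whether the value to validate converts to the chosen type (an assumption), and the helper names convert and isValid are hypothetical.

```javascript
// Operator names follow the list above; the implementation is illustrative.
const operators = {
  Equal: (a, b) => a === b,
  NotEqual: (a, b) => a !== b,
  GreaterThan: (a, b) => a > b,
  GreaterThanEqual: (a, b) => a >= b,
  LesserThan: (a, b) => a < b,
  LesserThanEqual: (a, b) => a <= b,
};

// Converts recognized text to the comparison type; undefined means "not convertible".
function convert(text, type) {
  switch (type) {
    case "String":
      return String(text);
    case "Integer":
      return /^[+-]?\d+$/.test(text) ? parseInt(text, 10) : undefined;
    case "Double": {
      const n = Number(text);
      return text.trim() === "" || Number.isNaN(n) ? undefined : n;
    }
    default:
      throw new Error("Type not covered by this sketch: " + type);
  }
}

// Evaluates: valueToValidate [operator] valueToCompare.
function isValid(valueToValidate, operator, valueToCompare, type) {
  const left = convert(valueToValidate, type);
  if (operator === "DataTypeCheck") return left !== undefined;
  const right = convert(valueToCompare, type);
  if (left === undefined || right === undefined) return false;
  const test = operators[operator];
  if (!test) throw new Error("Unknown operator: " + operator);
  return test(left, right);
}

const asInteger = isValid("10", "GreaterThan", "9", "Integer"); // numeric comparison
const asString = isValid("10", "GreaterThan", "9", "String");   // lexical comparison
```

The same pair of values validates as integers but not as strings, which is why the Type property matters for ordered operators.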

[1261] InvalidateBoth

[1262] Optional. If true, both SemanticItemToCompare and SemanticItemToValidate are marked Empty. Default is false (i.e., invalidate only the SemanticItemToValidate). If SemanticItemToCompare has not been set (i.e., ValueToCompare has been specified), InvalidateBoth is ignored.

[1263] The following example illustrates the usage of the InvalidateBoth attribute. The scenario is an itinerary application. The user has already been prompted for, and has answered, the question for the departing city. At this point in the dialog an ASP.NET textbox control has been filled with the recognition results (assume txtDepartureCity.Value=“Austin”).

[1264] The next QA prompts the user for the arrival city, and the SemanticItem object binds to txtArrivalCity.Value. In response to the prompt, the user says “Boston”. However, the recognition engine returns “Austin” (i.e., the arrival city is the same as the departing city).

[1265] The CompareValidator control may be used to direct the dialog flow in this case to re-prompt the user for both departing and arriving cities:

<CompareValidator id=“compareCities”
    SpeechIndex=“5”
    Type=“String”
    SemanticItemToCompare=“si_DepartureCity”
    SemanticItemToValidate=“si_ArrivalCity”
    Operator=“NotEqual”
    InvalidateBoth=“True”
    runat=“server”>
</CompareValidator>
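The effect of this validator on the two semantic items can be traced with a short script. It is a sketch under assumptions: the item objects, the lowerCamelCase validator members and the helper name runNotEqualValidator are hypothetical, and the item ids are written here as si_DepartureCity and si_ArrivalCity.

```javascript
// Hypothetical sketch of a NotEqual comparison with InvalidateBoth.
function runNotEqualValidator(validator, items) {
  const toValidate = items[validator.semanticItemToValidate];
  const toCompare = items[validator.semanticItemToCompare];
  if (toValidate.text !== toCompare.text) return true; // comparison holds
  toValidate.state = "Empty";
  if (validator.invalidateBoth) toCompare.state = "Empty";
  return false;
}

const cities = {
  si_DepartureCity: { text: "Austin", state: "Confirmed" },
  si_ArrivalCity: { text: "Austin", state: "Confirmed" },
};
const passed = runNotEqualValidator(
  {
    semanticItemToCompare: "si_DepartureCity",
    semanticItemToValidate: "si_ArrivalCity",
    invalidateBoth: true,
  },
  cities
);
```

Because both cities were recognized as “Austin”, the comparison fails and both items become Empty, so the dialog returns to the QAs that collect them.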

[1266] StyleReference

[1267] Optional. Specifies the name of a Style object. At render time, the CompareValidator control will search for the named Style control and will use any property values specified on the Style as default values for its own properties. Explicitly set property values on the CompareValidator control will override those set on the Style.

[1268] Prompt

[1269] Optional. Prompt to indicate the error.

[1270] 11 CustomValidator Control

[1271] The CustomValidator control is used to validate recognition results when complex validation algorithms are required. The control allows dialog authors to specify their own validation routines. The CustomValidator is triggered on the client by change or confirm events; however, validation prompts are played in SpeechIndex order.

[1272] The CustomValidator control is only rendered for voice-only mode. For multimodal, ASP.NET validator controls may be used.

class CustomValidator : IndexedStyleReferenceSpeechControl
{
    string id{get; set;};
    int SpeechIndex{get; set;};
    string ValidationEvent{get; set;};
    string SemanticItemToValidate{get; set;};
    string ClientValidationFunction{get; set;};
    string StyleReference{get; set;};
    Prompt Prompt{get;};
}

[1273] 11.1 CustomValidator Properties

[1274] All properties of the CustomValidator control are only used in voice-only mode and are available to the application developer at design time.

[1275] SpeechIndex

[1276] Optional. Only used in voice-only mode. Specifies the activation order of speech controls on a page and the activation order of composite controls. If more than one control has the same SpeechIndex, they are activated in source order. In situations where some controls specify SpeechIndex and some controls do not, those with SpeechIndex specified will be activated first, then the rest in source order. SpeechIndex values start at 1. An exception will be thrown for invalid values of SpeechIndex.

[1277] ValidationEvent

[1278] Default is “onconfirmed”. ValidationEvent may be set to one of two values, either “onchanged” or “onconfirmed”.

[1279] If ValidationEvent is set to “onchanged”, the CustomValidator will be run each time the value of the Text property of the associated SemanticItem changes. The CustomValidator control will be run before the SemanticItem's OnChanged handler is called. The SemanticItem's OnChanged handler will only be called if the CustomValidator does indeed validate the changed data. If the CustomValidator invalidates the data, the State of the SemanticItem is set to Empty and the OnChanged handler is not called.

[1280] If ValidationEvent is set to “onconfirmed”, the CustomValidator will be run each time the State of the associated SemanticItem changes to Confirmed. The CustomValidator control will be run before the SemanticItem's OnConfirmed handler is called. The SemanticItem's OnConfirmed handler will only be called if the CustomValidator does indeed validate the changed data. If the CustomValidator invalidates the data, the State of the SemanticItem is set to Empty and the OnConfirmed handler is not called.

[1281] After processing all SemanticItems involved in a recognition turn, RunSpeech starts again. At that point, the previously failed validators will be active and RunSpeech will select the first QA/Validator that is active in SpeechIndex order. It is the author's responsibility to place the validator controls directly before the QA control that collects the answer for the SemanticItem in order to get the correct behavior.

[1282] SemanticItemToValidate

[1283] Required. Specifies the id of the SemanticItem that is being validated. An exception will be thrown if SemanticItemToValidate is not specified.

[1284] ClientValidationFunction

[1285] Required. Specifies a function that checks the value of the SemanticItemToValidate.AttributeToValidate and returns true or false, indicating whether the value is valid or invalid. The signature for ClientValidationFunction is as follows:

[1286] bool ClientValidationFunction (string value)

[1287] where:

[1288] value is the contents of ElementToValidate.AttributeToValidate.

[1289] An exception will be thrown if ClientValidationFunction is not specified.
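As an illustration of the signature above, a client validation function might look like the following sketch. The function name validatePartySize and the rule it enforces (a whole number from 1 to 8) are invented for the example; only the shape, a string in and true or false out, comes from the description.

```javascript
// Hypothetical validator: accepts a party size of 1 to 8 given as digits.
function validatePartySize(value) {
  if (!/^\d+$/.test(value)) return false; // not a whole number
  const n = parseInt(value, 10);
  return n >= 1 && n <= 8;
}

const okSize = validatePartySize("4");
const tooLarge = validatePartySize("12");
```

Such a function would be named in the ClientValidationFunction property of a CustomValidator whose SemanticItemToValidate holds the recognized party size.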

[1290] StyleReference

[1291] Optional. Specifies the name of a Style object. At render time, the CustomValidator control will search for the named Style control and will use any property values specified on the Style as default values for its own properties. Explicitly set property values on the control will override those set on the Style.

[1292] Prompt

[1293] Optional. Prompt to indicate the error.

[1294] 12 Answer Object

[1295] The Answer object contains information on how to process recognition results and bind the results to controls on an ASP.NET page.

[1296] How the Answer object is used.

[1297] Voice-only mode.

[1298] The RunSpeech script uses the Answer object to perform answer processing on the client. Answer processing begins when the OnReco event fired by the speech platform is received by the client. The resultant SML document returned by the speech platform is searched for the node specified by the required XpathTrigger attribute. If the XpathTrigger node is found in the SML document and contains a non-null value, the value is filled into the semantic item specified in the SemanticItem property of the answer. If the XpathTrigger node does not exist in the SML document or its value is null, RunSpeech looks for the next QA to activate.

[1299] After the non-null value of the XpathTrigger node is found, RunSpeech invokes the ClientNormalizationFunction (if specified). The ClientNormalizationFunction returns a text string that reflects the author-defined transformation of the value of the XpathTrigger node. For example, the author may wish to transform the date “Nov. 17, 2001” returned by the speech platform to “11/17/2001”. Semantic items are used for both simple and complex data binding.

[1300] The SML document returned by the speech platform may contain a platform-specific confidence rating for each XpathTrigger node. During answer processing, RunSpeech compares this confidence rating to the value specified in the ConfirmThreshold attribute of the Answer object. Results of the comparison are then used to set the internal confirmed state of the semantic item. This state information is subsequently used to determine whether or not an answer requires confirmation from the user.

[1301] RunSpeech internally marks an answer as needing confirmation if the confidence returned with the XpathTrigger is less than or equal to the value of the ConfirmThreshold attribute. Otherwise, RunSpeech internally marks the semantic item associated with the answer as confirmed. This internal state information is used during confirmation processing.

[1302] Multimodal.

[1303] The Answer object is used in multimodal scenarios by the Multimodal.js script just as it is used by RunSpeech in voice-only (described above), with one exception. In multimodal, platform-specific confidence ratings are not compared to the ConfirmThreshold attribute of the Answer object; therefore, internal state information of each answer is not maintained. Confirmation of results is done visually. If an incorrect result is bound to a visual control, the user senses the problem visually and may then initiate another speech input action to correct the error.

[1304] Rendered for both multimodal and voice-only modes.

class Answer : Control
{
    string id{get; set;};
    float Reject{get; set;};
    float ConfirmThreshold{get; set;};
    string XpathTrigger{get; set;};
    string SemanticItem{get; set;};
    string ClientNormalizationFunction{get; set;};
    string StyleReference{get; set;};
}

[1305] 12.1 Answer Properties

[1306] All properties of the Answer object are available to the application developer at design time.

[1307] Reject

[1308] Optional. Used in both multimodal and voice-only modes. Specifies the rejection threshold for the Answer. Answers having confidence values below Reject will cause a noReco event to be thrown. If not specified, the value 0 will be used. Legal values are 0-1 and are platform specific. An exception will be thrown for out-of-range Reject values.

[1309] Rejected Answers are treated as if they were not present in the reco result to begin with. If, after this processing, no relevant information remains (no Answers, ExtraAnswers, Confirms, Commands, or xpathAcceptConfirms/xpathDenyConfirms), an onnoreco event is fired (which mimics exactly the tags version).

[1310] ConfirmThreshold

[1311] Optional. Used in voice-only mode. Specifies the minimum confidence level of recognition that is necessary to mark this item as confirmed. If the confidence of the matched item is less than or equal to this threshold, the item is marked as needing confirmation. Legal values are 0-1. Default value is 0. An exception will be thrown for out-of-range ConfirmThreshold values.
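Taken together, Reject and ConfirmThreshold split the confidence range into three bands. The sketch below illustrates that partition in voice-only mode; the function name classifyAnswer, the lowerCamelCase members and the returned labels “Rejected” and “NeedsConfirmation” are hypothetical.

```javascript
// Hypothetical sketch: classifies one matched answer by its confidence.
function classifyAnswer(confidence, answer) {
  const { reject = 0, confirmThreshold = 0 } = answer; // both default to 0
  for (const t of [reject, confirmThreshold]) {
    if (t < 0 || t > 1) throw new RangeError("threshold must be within 0-1");
  }
  if (confidence < reject) return "Rejected"; // treated as absent from the result
  if (confidence <= confirmThreshold) return "NeedsConfirmation";
  return "Confirmed";
}

const settings = { reject: 0.3, confirmThreshold: 0.7 };
const low = classifyAnswer(0.2, settings);
const borderline = classifyAnswer(0.7, settings);
const high = classifyAnswer(0.9, settings);
```

Note that a confidence exactly equal to ConfirmThreshold still needs confirmation, since the comparison is “less than or equal to”.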

[1312] XpathTrigger

[1313] Required for Answers and ExtraAnswers. Optional for Confirms. Used in both multimodal and voice-only modes. Specifies what part of the SML document this answer refers to. It is specified as an XPath on the SML output from recognition. An exception will be thrown if XpathTrigger is not specified for Answers or ExtraAnswers. XpathTrigger must be a valid XML path. An invalid XML path will cause a redirection to the default error page at run time.

[1314] For Confirms, if XpathTrigger is not set or set to the emptystring, the confirm won't allow for correction. Yes/no confirmations areenabled when XpathTrigger is used in this way.

[1315] SemanticItem

[1316] Optional. Used in both multimodal and voice-only modes.

[1317] ClientNormalizationFunction

[1318] Optional. Used in both multimodal and voice-only modes. Specifies a client-side function that takes the matched SML node as a parameter and returns a string that reflects the author-specified normalization (transformation) of the recognized item. The signature for ClientNormalizationFunction is as follows:

[1319] string ClientNormalizationFunction(XMLNode SMLnode, object SemanticItem)

[1320] where:

[1321] SMLnode is the node specified in the Xpath.

[1322] SemanticItem is the client-side SemanticItem object specified in the Answer object.
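A normalization function for the date example given earlier (“Nov. 17, 2001” to “11/17/2001”) could be sketched as follows. The function name normalizeDate is hypothetical, and the SML node is modeled as a plain object with a text member, which is an assumption about how the node's value is read.

```javascript
// Hypothetical normalization function; smlNode.text stands in for the node's value.
const MONTHS = {
  Jan: 1, Feb: 2, Mar: 3, Apr: 4, May: 5, Jun: 6,
  Jul: 7, Aug: 8, Sep: 9, Oct: 10, Nov: 11, Dec: 12,
};

function normalizeDate(smlNode, semanticItem) {
  const raw = smlNode.text.trim();
  // Month name (abbreviated or full), day, comma, four-digit year.
  const match = /^([A-Za-z]{3})[a-z]*\.?\s+(\d{1,2}),\s*(\d{4})$/.exec(raw);
  if (!match || !MONTHS[match[1]]) return raw; // leave unrecognized input unchanged
  return MONTHS[match[1]] + "/" + Number(match[2]) + "/" + match[3];
}

const normalized = normalizeDate({ text: "Nov. 17, 2001" }, null);
```

The semanticItem parameter is unused in this sketch; it is kept so the function matches the two-parameter signature.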

[1323] StyleReference

[1324] Optional. Used in both multimodal and voice-only modes. Specifies the name of a Style object. At render time, the Answer object will search for the named Style control and will use any property values specified on the Style as default values for its own properties. Explicitly set property values on the Answer object will override those set on the referenced Style.

[1325] 13 SemanticMap Control

[1326] SemanticMap is a container of SemanticItem objects.

class SemanticMap : SpeechControl
{
    SemanticItemCollection SemItems{get;};
    SemanticItem GetSemanticItem (string name);
}

[1327] 13.1 SemanticMap Properties

[1328] SemItems

[1329] A collection of SemanticItem objects.

[1330] 13.2 SemanticMap Methods

[1331] GetSemanticItem

[1332] This is a function that takes the id of a SemanticItem and returns the SemanticItem object. The signature of GetSemanticItem is:

function GetSemanticItem(string id)

[1333] 14 SemanticItem Object

[1334] The SemanticItem object describes where and when an Answer's recognition results are written to visual controls on a page. The object also keeps track of the current state of Answers, i.e., whether an Answer has changed or been confirmed.

class SemanticItem : Control
{
    string id{get; set;};
    string TargetElement{get; set;};
    string TargetAttribute{get; set;};
    bool BindOnChanged{get; set;};
    string BindAt{get; set;};
    bool AutoPostBack{get; set;};
    string OnClientChanged{get; set;};
    string OnClientConfirmed{get; set;};
    SemanticEventHandler Changed;
    SemanticEventHandler Confirmed;
    string Text{get;};
    SemanticState State{get;};
    StringDictionary Attributes{get; set;};
    string StyleReference{get;};
}

[1335] 14.1 SemanticItem Properties

[1336] id

[1337] Required. The programmatic id of this semantic item.

[1338] TargetElement

[1339] Optional. Used in both multimodal and voice-only modes. Specifies the id of the visual control to which the recognition results should be written. If specified, default binding will occur when the value is changed or confirmed, depending on the value of BindOnChanged. An exception is thrown if TargetElement is the id of multiple controls.

[1340] TargetAttribute

[1341] Optional. Used in both multimodal and voice-only modes. Specifies the property name of the TargetElement to which this answer should be written. The default value is null. An exception will be thrown if TargetElement is specified and TargetAttribute is not specified.

[1342] BindOnChanged

[1343] Optional. Used in voice-only mode, ignored in multimodal. Default is false. In voice-only mode, BindOnChanged controls when to bind recognition results to visual elements.

[1344] A value of true causes binding every time the value of the SemanticItem changes.

[1345] A value of false causes binding only when the SemanticItem has been confirmed.
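The binding rule for voice-only mode can be sketched as follows. The page is modeled as a plain object keyed by control id, and the function name onSemanticEvent, the event labels and the lowerCamelCase members are hypothetical; the sketch also assumes that with BindOnChanged set to true no additional binding happens on confirmation.

```javascript
// Hypothetical sketch: writes item.text to the target when the binding event fires.
function onSemanticEvent(item, event, page) {
  // event is "changed" or "confirmed" (labels assumed for this sketch).
  const bindEvent = item.bindOnChanged ? "changed" : "confirmed";
  if (event !== bindEvent || !item.targetElement) return false;
  page[item.targetElement][item.targetAttribute] = item.text;
  return true;
}

const page = { txtArrivalCity: { value: "" } };
const arrivalItem = {
  targetElement: "txtArrivalCity",
  targetAttribute: "value",
  bindOnChanged: false,
  text: "Boston",
};
const boundOnChange = onSemanticEvent(arrivalItem, "changed", page);
const boundOnConfirm = onSemanticEvent(arrivalItem, "confirmed", page);
```

With BindOnChanged left at its default of false, the textbox is only filled once the value has been confirmed.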

[1346] BindAt

[1347] Optional. Used in both multimodal and voice-only modes. Can be omitted or set to “server”. Default is null (omitted). If BindAt is set to “server”, it indicates that the TargetElement/TargetAttribute pair refers to a server-side control or property. An exception will be thrown when BindAt is set to an invalid value.

[1348] If BindAt is “server”, an exception will be thrown if:

[1349] SemanticItem.TargetElement is not a server-side control, or

[1350] SemanticItem.TargetAttribute is not a member of the control specified by SemanticItem.TargetElement, or

SemanticItem.TargetAttribute is a member of SemanticItem.TargetElement, but is not of type string, or

[1351] SemanticItem.TargetAttribute is a string, but is read-only.

[1352] AutoPostBack

[1353] Optional. Used in both multimodal and voice-only modes. Specifies whether or not the control posts back to the server when the binding event is fired. The binding event can be onChanged or onConfirmed and is controlled by the value of BindOnChanged. Default is false.

[1354] The state of the voice-only page is maintained automatically during postback. Authors may use the ClientViewState object of RunSpeech to declare and set any additional values they wish to persist across postbacks.

[1355] OnClientChanged

[1356] Optional. Used in both multimodal and voice-only modes. Specifiesa client-side function to be called when the value of the Text propertyof this SemanticItem changes. The function does not return any values.The signature for OnClientChanged is as follows:

[1357] function OnClientChanged(object SemanticItem)

[1358] where SemanticItem is the client-side SemanticItem object.

[1359] Note: If AutoPostBack is set to true, the OnClientChangedfunction is executed before posting back to the server. If the authorwishes to persist any page state across postback, the OnClientChangedfunction is a good place to access the ClientViewState object ofRunSpeech.

[1360] OnClientConfirmed

[1361] Optional. Used in both multimodal and voice-only modes. Specifies a client-side function to be called when this SemanticItem's value is confirmed. The function does not return any values. The signature for OnClientConfirmed is as follows:

[1362] function OnClientConfirmed(object SemanticItem)

[1363] where SemanticItem is the client-side SemanticItem object. Note: If AutoPostBack is set to true, the OnClientConfirmed function is executed before posting back to the server. If the author wishes to persist any page state across postback, the OnClientConfirmed function is a good place to access the ClientViewState object of RunSpeech.

[1364] Changed

[1365] Optional. Used in both multimodal and voice-only modes. Specifies a server-side script function to be executed when the Changed event is fired.

[1366] The signature of a SemanticEventHandler in C# is as follows (the signature would look slightly different in other languages):

[1367] public delegate void SemanticEventHandler(object sender, SemanticEventArgs e);

[1368] where:

[1369] SemanticEventArgs is a class derived from System.EventArgs.

[1370] public class SemanticEventArgs : EventArgs {
    public string Text { get; }
    public StringDictionary Attributes { get; }
}

Text: Returns the value that this SemanticItem has been set to.

State: Returns the state of this SemanticItem.

[1371] Confirmed

[1372] Optional. Used in both multimodal and voice-only modes. Specifies a server-side script function to be executed when the Confirmed event is fired. In multimodal mode, the Confirmed event will be fired immediately after the Changed event.

[1373] The signature of a SemanticEventHandler in C# is as follows (the signature would look slightly different in other languages):

[1374] public delegate void SemanticEventHandler(object sender, SemanticEventArgs e);

[1375] where: SemanticEventArgs is a class derived from System.EventArgs.

public class SemanticEventArgs : EventArgs {
    public string Text { get; }
    public StringDictionary Attributes { get; }
}

Text: Read only. Returns the value that this SemanticItem has been set to.

State: Read only. Returns the state of this SemanticItem.

[1376] Text

[1377] The text value that this SemanticItem has been set to. Default is null.

[1378] State

[1379] The confirmation state of this SemanticItem. Values of State will be one of SemanticState.Empty, SemanticState.NeedsConfirmation or SemanticState.Confirmed.

[1380] Attributes

[1381] Optional. Used in both multimodal and voice-only modes. This is a collection of name/value pairs. Attributes is used to pass user-defined information to the client-side semantic item and back to the server (they are kept synchronized). Attributes may only be set programmatically. For example:

[1382] SemanticItem.Attributes["myvarname"] = "myvarvalue";

[1383] Attributes are not cleared when the SemanticItem is reset by the system. If developers wish to reset the attributes, they must do so manually.

[1384] StyleReference

[1385] Optional. Used in both multimodal and voice-only modes. Specifies the name of a Style object. At render time, the QA SemanticItem object will search for the named Style control and will use any property values specified on the Style as default values for its own properties. Property values set explicitly on the SemanticItem object will override those set on the referenced Style.

[1386] 14.2 SemanticItem Client-Side Object

// Notation doesn't imply programming language
class SemanticItem {
    SemanticItem(sco, id, targetElement, targetAttribute, bindOnChanged, bindAtServer, autoPostback, onClientChanged, onClientConfirmed, hiddenFieldID, value, state);
    SetText(string text, boolean isConfirmed);
    Confirm( );
    Clear( );
    Empty( );
    AddValidator(validator);
    IsEmpty( );
    NeedsConfirmation( );
    IsConfirmed( );
    Encode( );
    Object value;       // Read only
    string state;       // Read only
    object attributes;
}

[1387] SetText (string text, boolean isConfirmed)

[1388] The SetText method of the client-side semantic item object is used to alter the value property. The parameters are:

[1389] string text: the string which will become the value of the Semantic Item

[1390] boolean isConfirmed: determines whether the Semantic Item state property is “confirmed” (if true) or “needs confirmation” (if false)

[1391] Confirm( )

[1392] This method sets the state property of the Semantic Item to “confirmed.”

[1393] Clear( )

[1394] This method sets the value property of the Semantic Item to NULL and sets the state property to “empty.”

[1395] Empty( )

[1396] AddValidator (Validator)

[1397] IsEmpty( )

[1398] This method checks the state property of the Semantic Item and returns true if it is “empty” and false if it is “needs confirmation” or “confirmed.”

[1399] NeedsConfirmation( )

[1400] This method checks the state property of the Semantic Item and returns true if it is “needs confirmation” and false if it is “empty” or “confirmed.”

[1401] IsConfirmed( )

[1402] This method checks the state property of the Semantic Item and returns true if it is “confirmed” and false if it is “needs confirmation” or “empty.”

[1403] Encode( )

[1404] Object Value

[1405] Read only.

[1406] String State

[1407] Read only.

[1408] Object Attributes
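The state handling described for the client-side semantic item can be sketched as a small object. This is an illustrative stand-in rather than the actual client-side implementation: the name SemanticItemSketch is hypothetical, only the methods whose behavior is described above are modeled, and leaving attributes untouched in Clear( ) is an assumption.

```javascript
// Hypothetical stand-in for the client-side SemanticItem object.
// Only the methods whose behavior is described in the text are modeled.
function SemanticItemSketch() {
  this.value = null;      // altered through SetText, reset by Clear
  this.state = "empty";   // "empty", "needs confirmation" or "confirmed"
  this.attributes = {};   // user-defined name/value pairs
}

// Sets the value; isConfirmed selects the resulting state.
SemanticItemSketch.prototype.SetText = function (text, isConfirmed) {
  this.value = text;
  this.state = isConfirmed ? "confirmed" : "needs confirmation";
};

// Marks the current value as confirmed.
SemanticItemSketch.prototype.Confirm = function () {
  this.state = "confirmed";
};

// Resets value and state. Attributes are left alone (assumption).
SemanticItemSketch.prototype.Clear = function () {
  this.value = null;
  this.state = "empty";
};

SemanticItemSketch.prototype.IsEmpty = function () {
  return this.state === "empty";
};

SemanticItemSketch.prototype.NeedsConfirmation = function () {
  return this.state === "needs confirmation";
};

SemanticItemSketch.prototype.IsConfirmed = function () {
  return this.state === "confirmed";
};
```

A typical sequence would be SetText("Seattle", false), after which NeedsConfirmation( ) returns true, followed by Confirm( ), after which IsConfirmed( ) returns true.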

[1409] 14.3 Run-Time Behavior

[1410] As a general rule, the order of execution for every transition (Empty->NeedsConfirmation or NeedsConfirmation->Confirmed) is:

[1411] Client-side binding (if needed)

[1412] Client-side event

[1413] If AutoPostBack is true, trigger submit.

[1414] On the server, the order of execution is:

[1415] Server-side binding (if needed)

[1416] Server-side event.

[1417] If the semantic item is programmatically changed in the server, no events (server or client side) will be thrown.

[1418] If BindOnChanged=false and AutoPostBack=true and both Changed and Confirmed handlers are set, both events will be triggered, in order.

[1419] Changed events will be thrown in the server (if needed and handlers are set) even if the server-side value is the same as the previous one (that is, even if the value did not actually change).

[1420] If AutoPostBack is set to true, the controls will cause two postbacks, synchronized with onChanged and onConfirmed.
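The client-side ordering above can be sketched as a function that lists the steps taken for one transition. This is a reading of the rules in this section, not the actual script: the function name clientTransitionSteps and the settings object are hypothetical, and "binding if needed" is interpreted here as "the transition matches the binding event selected by BindOnChanged".

```javascript
// Returns the ordered client-side steps for one semantic item transition.
// transition: "changed"   (Empty -> NeedsConfirmation) or
//             "confirmed" (NeedsConfirmation -> Confirmed).
// settings:   { bindOnChanged: boolean, autoPostBack: boolean } (hypothetical shape).
function clientTransitionSteps(transition, settings) {
  var steps = [];
  // Assumption: binding is "needed" only on the configured binding event.
  var bindingEvent = settings.bindOnChanged ? "changed" : "confirmed";
  if (transition === bindingEvent) {
    steps.push("bind");
  }
  // The client-side event handler always follows any binding.
  steps.push(transition === "changed" ? "OnClientChanged" : "OnClientConfirmed");
  // The submit, if any, comes last, so handlers run before the postback.
  if (settings.autoPostBack) {
    steps.push("submit");
  }
  return steps;
}
```

With AutoPostBack set to true, running the function once for each transition yields a submit step both times, which matches the two postbacks noted above.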

[1421] 15 Prompt Object

[1422] The prompt object contains information on how to play prompts. All the properties defined are read/write properties.

[1423] Rendered for voice-only. Not rendered for multimodal.

[1424] How Prompt Object is Used

[1425] Voice-Only

[1426] The Prompt object is a required element of the QA control. RunSpeech uses the Prompt object to select the appropriate text for the prompt and then play the prompt on the client.

[1427] After RunSpeech determines which QA to activate, it either increments or initializes the count attribute of the QA. The count attribute is incremented if the QA being activated was the same QA that was active during the last loop through RunSpeech. The count attribute is initialized to count=1 if this is the first time the QA has been activated. The QA's count attribute may be used by the script specified in the PromptSelectFunction attribute of the Prompt object.

[1428] RunSpeech then sets out to determine which text will be synthesized and played back to the user. The dialog author has the option of providing a script function for prompt text that is complex to build, or simply specifying the prompt text as content of the Prompt object. If RunSpeech detects the existence of an author-specified PromptSelectFunction, it passes the text returned from the PromptSelectFunction to the speech platform for synthesis and playback to the user. Otherwise RunSpeech will pass the text in the content of the Prompt object to the speech platform.

[1429] If a serious or fatal error occurs during the synthesis process, the speech platform will fire the onError event. RunSpeech receives this event, sets lastCommandOrException to “PromptError” and calls the script function specified by the OnClientError attribute. The dialog author may then choose to take appropriate action based upon the type of error that occurred.

[1430] After the prompt playback has finished, the speech platform fires the oncomplete event, which is caught by RunSpeech. RunSpeech then looks for the Reco object associated with the current QA. If a Reco object is found (i.e., the QA is not just a prompting mechanism), RunSpeech requests the speech platform to start the recognition process.

[1431] Finally, RunSpeech examines the value of the PlayOnce attribute of the QA containing the Prompt object. If PlayOnce is true, RunSpeech disables the Prompt object for subsequent activations of this same QA.

[1432] If speech is detected during the playing of the prompt, the playback of the prompt will be stopped automatically by the platform. RunSpeech catches the onbargein event and halts execution. Since a prompt.OnComplete event may not follow a bargein, RunSpeech resumes when a listen event is received.

[1433] If a bookmark is encountered, RunSpeech activates the function specified by the OnClientBookmark property.
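The count handling and prompt text selection described in this walkthrough can be sketched as two helpers. This is not RunSpeech itself: the names activateQa and selectPromptText and the shape of the qa object (id, count, prompt.promptSelectFunction, prompt.inlinePrompt) are hypothetical stand-ins chosen for the example.

```javascript
// Updates the activation count of the QA about to run.
// lastActiveQaId is the id of the QA active in the previous loop (or null).
function activateQa(qa, lastActiveQaId) {
  if (qa.id === lastActiveQaId) {
    qa.count = (qa.count || 0) + 1;  // same QA again: increment
  } else {
    qa.count = 1;                    // first consecutive activation
  }
  return qa.count;
}

// Chooses the text handed to the speech platform for synthesis.
function selectPromptText(qa, lastCommandOrException, semanticItems) {
  var prompt = qa.prompt;
  if (typeof prompt.promptSelectFunction === "function") {
    // An author-specified function takes precedence over inline content.
    return prompt.promptSelectFunction(lastCommandOrException, qa.count, semanticItems);
  }
  return prompt.inlinePrompt;
}
```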

[1434] Multimodal

[1435] The Prompt object is not used in multimodal scenarios.

[1436] PromptSelectFunction

[1437] The following three examples illustrate using the PromptSelectFunction to select or modify prompt text using the parameters available to PromptSelectFunction.

[1438] The first example shows how to use the count parameter to select a prompt based upon the number of times the QA has been activated. The scenario is:

[1439] A user calls a menu-based service and enters a password. Server-side processing determines the user's first and last name and inserts the name information into hidden textboxes (txtFirstName.value and txtLastName.value) on the welcome page. The welcome page contains a QA which prompts the user to enter the desired service. The QA's Prompt object is built to handle 1) the prompt to play for a first time pass and 2) the prompt to play if the user fails to select a service at the first prompting (i.e., the same QA is activated after a timeout expires).

<speech:QA id="welcomeQA" runat="server">
  <Prompt id="welcomePrompt" PromptSelectFunction="SelectWelcomePrompt" />
  <Reco id="welcomeReco" mode="automatic">
    <Grammars>
      <speech:grammar id="welcomeGrammar" src="http://mysite/services.xml" runat="server" />
    </Grammars>
  </Reco>
  <Answers>
    <speech:answer id="servicesAnswer" SemanticItem="siService" runat="server" />
  </Answers>
</speech:QA>

<script>
function SelectWelcomePrompt(lastCommandOrException, count, assocArray) {
  switch (count) {
    case 1:
      return "Welcome to Acme Services <SALT:value>txtFirstName.value</SALT:value>. Please select the Email, Calendar or Stock service.";
    case 2:
      return "Welcome Please select the Email, Calendar or Stock service.";
    case 3:
      return "Welcome Please select the Email, Calendar or Stock service.";
    default:
      return "I'm sorry <SALT:value>txtFirstName.value</SALT:value>, we're having communication problems. Good Bye.";
  }
}
</script>

[1440] The next example shows how to use the lastCommandOrException parameter to modify a prompt based upon a previous event in the dialog. The scenario is:

[1441] A user is asked to provide the name of a departing airport. The QA contains a Prompt object that is built to handle the initial prompt, a prompt if the user asks for help, and a prompt if the user fails to respond (i.e., a timeout occurs).

<speech:qa id="qa1" runat="server">
  <Prompt id="prompt1" PromptSelectFunction="SelectDepartingAirport" />
  <Reco id="reco1" mode="automatic">
    <Grammars>
      <speech:grammar id="gram1" src="http://mysite/NYAirport.xml" runat="server" />
    </Grammars>
  </Reco>
  <Answers>
    <speech:answer id="ans1" SemanticItem="siAns1" runat="server" />
  </Answers>
</speech:qa>

<speech:command id="command1" runat="server" XpathTrigger="/sml/help" scope="qa1" type="HELP">
  <Grammar src="http://mysite/help.xml" runat="server" />
</speech:command>

<script>
function SelectDepartingAirport(lastCommandOrException, count, assocArray) {
  // First activation: play the initial prompt.
  if (count == 1) return "From which airport would you like to depart?";
  // Later activations: choose the prompt from the previous event.
  switch (lastCommandOrException) {
    case "SILENCE":
      return "I'm sorry I didn't catch that. From which airport would you like to depart?";
    case "HELP":
      return "You may choose from Kennedy, La Guardia, or that little airport on Long Island. From which airport would you like to depart?";
    default:
      return "What we have here is a failure to communicate. Good bye.";
  }
}
</script>

[1442] The last example shows how to use the assocArray parameter to modify a prompt during a confirmation pass. The scenario is:

[1443] The user is asked to provide itinerary details: departing and arrival cities and travel date. The QA is constructed to implicitly confirm the departing and arrival city information and explicitly confirm the travel date. The Prompt object is built to provide appropriate prompting of items requiring confirmation.

<speech:qa id="qa1" runat="server">
  <Prompt id="prompt1" InlinePrompt="What is your desired itinerary?"></Prompt>
  <Reco id="reco1" mode="Automatic">
    <Grammars>
      <speech:grammar id="grm1" src="http://mysite/city_date.xml" runat="server" />
    </Grammars>
  </Reco>
  <Answers>
    <speech:answer id="A1" XpathTrigger="/sml/departCity" SemanticItem="siTb1" ConfirmThreshold="0.90" runat="server" />
    <speech:answer id="A2" XpathTrigger="/sml/arrivalCity" SemanticItem="siTb2" ConfirmThreshold="0.90" runat="server" />
    <speech:answer id="A3" XpathTrigger="/sml/departDate" SemanticItem="siTb3" ConfirmThreshold="1.00" runat="server" />
  </Answers>
</speech:qa>

<speech:qa id="qa2" runat="server" XpathDenyConfirms="/sml/deny" XpathAcceptConfirms="/sml/accept">
  <Prompt id="prompt2" PromptSelectFunction="myPromptFunction" />
  <Reco id="reco2" mode="automatic">
    <Grammars>
      <speech:grammar id="grm2" src="http://mysite/cityANDdateANDyes_no.xml" runat="server" />
    </Grammars>
  </Reco>
  <Confirms>
    <speech:answer id="conf1" XpathTrigger="/sml/departCity" SemanticItem="siTb1" ConfirmThreshold="0.90" runat="server" />
    <speech:answer id="conf2" XpathTrigger="/sml/arrivalCity" SemanticItem="siTb2" ConfirmThreshold="0.90" runat="server" />
    <speech:answer id="conf3" XpathTrigger="/sml/departDate" SemanticItem="siTb3" ConfirmThreshold="1.00" runat="server" />
  </Confirms>
</speech:qa>

<script>
function myPromptFunction(lastCommandOrException, count, assocArray) {
  var promptText = "Did you say ";
  // Confirm the first item that currently holds a value.
  if (assocArray["siTb1"] != null && assocArray["siTb1"] != "") {
    promptText += "from " + assocArray["siTb1"];
    return promptText;
  }
  if (assocArray["siTb2"] != null && assocArray["siTb2"] != "") {
    promptText += "to " + assocArray["siTb2"];
    return promptText;
  }
  if (assocArray["siTb3"] != null && assocArray["siTb3"] != "") {
    promptText += "on " + assocArray["siTb3"];
    return promptText;
  }
}
</script>

class Prompt : Control {
    string id{get; set;};
    string type{get; set;};
    bool prefetch{get; set;};
    string lang{get; set;};
    bool bargein{get; set;};
    string src{get; set;};
    string PromptSelectFunction{get; set;};
    string OnClientBookmark{get; set;};
    string OnClientError{get; set;};
    string InlinePrompt{get; set;};
    string StyleReference{get; set;};
    ParamCollection Params{get; set;};
}

[1444] 15.1 Prompt Properties

[1445] All properties of the Prompt object are available at design time.

[1446] Type

[1447] Optional. Only used in voice-only mode. The mime-type corresponding to the speech output format used. No default value. The type attribute mirrors the type attribute on the SALT Prompt object.

[1448] Prefetch

[1449] Optional. Only used in voice-only mode. Flag to indicate whether the prompt should be immediately synthesized and cached at the browser when the page is loaded. Default value is false. The prefetch attribute mirrors the prefetch attribute on the SALT Prompt object.

[1450] Lang

[1451] Optional. Only used in voice-only mode. Specifies the language of the prompt content. The value of this attribute follows the RFC xml:lang definition. Example: lang=“en-us” denotes US English. No default value. If specified, this overrides the value set in the Web.config file. The lang attribute mirrors the lang attribute on the SALT Prompt object.

[1452] Bargein

[1453] Optional. Used only for voice-only mode. Flag that indicates whether or not the speech platform is responsible for stopping prompt playback when speech or DTMF input is detected. If true, the platform will stop the prompt in response to input and flush the prompt queue. If false, the platform will take no default action. If unspecified, defaults to true.

[1454] PromptSelectFunction

[1455] Optional. Only used in voice-only mode. Specifies a client-side function that allows authors to select and/or modify a prompt string prior to playback. The function returns the prompt string. PromptSelectFunction is called once the QA has been activated and before the prompt playback begins. If PromptSelectFunction is specified, src and InlinePrompt are ignored.

[1456] The signature for PromptSelectFunction is as follows:

[1457] String PromptSelectFunction(string lastCommandOrException, int Count, object SemanticItemList)

[1458] where:

[1459] lastCommandOrException is a Command type (e.g., “Help”) or a Reco event (e.g., “Silence” or “NoReco”).

[1460] Count is the number of times the QA has been activated consecutively. Count starts at 1 and has no limit.

[1461] SemanticItemList: for voice-only mode, SemanticItemList is an associative array that maps semantic item id to semantic item objects. For multimodal, SemanticItemList will be null.

[1462] If the PromptSelectFunction is being called from within a Prompt object specified by a CustomValidator control, the SemanticItemList will contain the SemanticItem being validated.

[1463] If the PromptSelectFunction is being called from within a Prompt object specified by a CompareValidator control, the SemanticItemList will contain the SemanticItem being validated and (if specified) the SemanticItem to which it is being compared.

[1464] OnClientBookmark

[1465] Optional. Only used in voice-only mode. Specifies a client-side function which is called when a Bookmark is reached in the prompt text during playback. The function does not return a value. The signature for OnClientBookmark is as follows:

[1466] function OnClientBookmark( )

[1467] OnClientError

[1468] Optional. Only used in voice-only mode. Specifies a client-side function which is called in response to an error event in the client. Error events are generated from the event object. The function returns a Boolean value. The RunSpeech algorithm will continue executing if an OnClientError script returns true. The RunSpeech algorithm will navigate to the default error page specified in the Web.config file if an OnClientError script returns false or if an error occurs and the OnClientError function is not specified. When navigating to the error page, both status and description will be passed in the query string. For example, if the error page is http://myErrorPage, we will navigate to http://myErrorPage?status=X&description=Y (where X is the status code associated with the error and Y is the description of that error given in the Speech Tags Specification). The signature for OnClientError is as follows:

[1469] bool OnClientError(int status)

[1470] where status is the code returned in the event object.
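The error-handling decision described above can be sketched as follows. This is an illustration only: the helper names buildErrorPageUrl and resolveClientError are hypothetical, and URL-encoding the two query values is an assumption that the text does not state.

```javascript
// Builds the error page URL with status and description in the query string.
// Encoding the values is an assumption made for this sketch.
function buildErrorPageUrl(errorPage, status, description) {
  return errorPage +
    "?status=" + encodeURIComponent(status) +
    "&description=" + encodeURIComponent(description);
}

// Returns null when execution should continue, or the URL to navigate to.
// onClientError is the author's handler, or undefined if none was specified.
function resolveClientError(onClientError, status, description, errorPage) {
  var handled = typeof onClientError === "function" && onClientError(status) === true;
  return handled ? null : buildErrorPageUrl(errorPage, status, description);
}
```

A handler that returns true lets the dialog continue, while a missing handler or a false return value produces the navigation target.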

[1471] Note: For the SDK Beta release, it is advisable to specify a default error page using the syntax described in Section 5 Global Application Settings.

[1472] InlinePrompt

[1473] Optional. Only used in voice-only mode. The text of the prompt to be played. It may contain further markup, as in TTS rendering information, or <value> elements. If a PromptSelectFunction function is specified, the InlinePrompt is ignored.

[1474] StyleReference

[1475] Optional. Used in both multimodal and voice-only modes. Specifies the name of a Style object. At render time, the Prompt object will search for the named Style control and will use any property values specified on the Style as default values for its own properties. Property values set explicitly on the Prompt object will override those set on the referenced Style.

[1476] Params

[1477] Optional. A collection of param objects that specify additional, non-standard configuration parameter values to the speech platform. The exact nature of the configuration parameters will differ according to the proprietary platform used. Values of parameters may be specified in an XML namespace, in order to allow complex or structured values. An exception will be thrown if the Params collection contains a non-param object.

[1478] For example, the following syntax could be used to specify the location of a remote prompt engine for distributed architectures:

<Params>
  <speech:param name="promptServer" runat="server">//myplatform/promptServer</speech:param>
</Params>

[1479] 16 Reco Object

[1480] Reco is rendered for both multimodal and voice-only modes.

[1481] The Reco object is used to specify speech input resources and features as well as provide for the management of cases when valid recognition results are not returned.

[1482] How Reco object is used.

[1483] Voice-Only

[1484] During the processing of the Prompt object, RunSpeech determines whether or not the currently active QA contains a Reco object. If it does, RunSpeech asks the speech platform to start the recognition process using the grammar specified by the Reco's Grammar object. RunSpeech calls the function specified by OnClientListening immediately after activating the Reco's underlying <listen> tag. The recognition process is stopped depending on the value of the mode attribute. RunSpeech processes successful recognition results using information specified in the Answer object.

[1485] RunSpeech uses the Reco object to handle the situations when the speech platform is not able to return valid recognition results, i.e., speech platform errors, timeouts, silence, or inability of the speech platform to recognize an utterance. In each of these cases, RunSpeech calls the appropriate handler (if specified) after setting the value of the lastCommandOrException attribute.

[1486] Multimodal

[1487] The Reco object is used by the Multimodal.js client-side script just as it is used by the RunSpeech voice-only client-side script (as described above) with one exception: starting/stopping the recognition process. Multimodal scenarios do not require speech output as a mechanism to prompt the user for input. In fact, prompting in speech controls is not available in multimodal scenarios as the Prompt object is not rendered to the client. Therefore, an alternate mechanism is required to start the recognition process.

[1488] Multimodal.js uses the event specified in the StartElement/StartEvent attributes to start the recognition process. The function specified by the OnClientListening attribute is called after the recognition process has started. Multimodal.js uses the combination of the StopEvent and mode attributes to stop the recognition process.

class Reco : Control {
    string id{get; set;};
    string StartElement{get; set;};
    string StartEvent{get; set;};
    string StopElement{get; set;};
    string StopEvent{get; set;};
    int initialTimeout{get; set;};
    int babbleTimeout{get; set;};
    int maxTimeout{get; set;};
    int endSilence{get; set;};
    float reject{get; set;};
    string mode{get; set;};
    string lang{get; set;};
    string GrammarSelectFunction{get; set;};
    string OnClientSpeechDetected{get; set;};
    string OnClientSilence{get; set;};
    string OnClientNoReco{get; set;};
    string OnClientError{get; set;};
    string StyleReference{get; set;};
    GrammarCollection Grammars{get; set;};
    ParamCollection Params{get; set;};
    Control record{get; set;};
}

[1489] 16.1 Reco Properties

[1490] All properties are available at design time.

[1491] StartElement

[1492] Optional, but must be present if StartEvent is specified.

[1493] Used only in multimodal mode. Specifies the name of the GUI element with which the start of the Reco is associated. See StartEvent. No default value.

[1494] StartEvent

[1495] Optional, but must be present if StartElement is specified. Only used in multimodal mode. Specifies the name of the event that will activate (start) the underlying client-side Reco object. See StartElement. No default value.

[1496] StopElement

[1497] Optional, but must be present if StopEvent is specified. Used only in multimodal mode. Specifies the name of the GUI element with which the stop of the Reco is associated. See StopEvent. No default value.

[1498] StopEvent

[1499] Optional, but must be present if StopElement is specified. Only used in multimodal mode. Specifies the name of the event that will stop the underlying client-side Reco object. See StopElement. No default value.

[1500] StartEvent and StopEvent will be used in multimodal applications, typically for tap-and-talk interactions. E.g., StartEvent=Button1.onmousedown, StopEvent=Button1.onmouseup.

[1501] StartEvent and StopEvent are allowed to be the same (click to start, click to stop). However, it is the author's responsibility to de-activate Recos before starting new ones in the case when the end user fires two StartEvents in succession (e.g., click on one control to start a reco, then click on a different control to start another reco before stopping the first reco).

[1502] Note: IE requires exact case when running JScript. Therefore, the case of event values specified in the StartEvent and StopEvent attributes must be exactly as those events are defined. For example, the onmouseup and onmousedown events are specified in all lower case letters.

[1503] Note: StartEvent and StopEvent are not rendered for voice-only mode.
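One way an author might meet the responsibility described above (de-activating a running reco before starting another) is a small coordinator that tracks the active reco. This is a sketch under stated assumptions: RecoCoordinator is a hypothetical helper, and the Start( ) and Stop( ) method names on the reco objects are assumed for illustration rather than taken from this section.

```javascript
// Hypothetical helper that allows at most one active reco at a time.
function RecoCoordinator() {
  this.active = null;  // the reco currently listening, if any
}

// Starts a reco, stopping any other reco that is still active.
RecoCoordinator.prototype.start = function (reco) {
  if (this.active === reco) {
    return;  // already listening; nothing to do
  }
  if (this.active) {
    this.active.Stop();  // de-activate the earlier reco first
  }
  this.active = reco;
  reco.Start();
};

// Stops a reco if it is the active one.
RecoCoordinator.prototype.stop = function (reco) {
  if (this.active === reco) {
    reco.Stop();
    this.active = null;
  }
};
```

In a tap-and-talk page, the onmousedown handler of each control would call start with that control's reco, and the onmouseup handler would call stop.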

[1504] InitialTimeout

[1505] Optional. Used in both multimodal and voice-only modes. The maximum time in milliseconds between start of recognition and the detection of speech. This value is passed to the recognition platform, and if exceeded, an onSilence event will be thrown from the recognition platform. If not specified, the speech platform will use a default value. No default value. An exception will be thrown for non-integer or negative integer values.

[1506] Note: The sum of the initialTimeout and babbleTimeout values should be smaller than or equal to the global maxTimeout attribute or the Reco attribute maxTimeout (see below) if it is set.

[1507] Note: The initialTimeout attribute mirrors the initialTimeout attribute on the SALT Reco object.

[1508] BabbleTimeout

[1509] Optional. Used in both multimodal and voice-only modes. The maximum period of time in milliseconds for an utterance. For recos in automatic and single mode, this applies to the period between speech detection and the speech endpoint or stop call. For recos in “multiple” mode, this timeout applies to the period between speech detection and each phrase recognition; that is, the period is restarted after each return of results or other event. If exceeded, the onnoreco event is thrown with status code −15. This can be used to control when the recognizer should stop processing excessive audio. For automatic mode listens, this will happen for exceptionally long utterances, for example, or when background noise is mistakenly interpreted as continuous speech. For single mode listens, this may happen if the user keeps the audio stream open for an excessive amount of time (e.g., by holding down the stylus in tap-and-talk). If the attribute is not specified, the speech platform will use a default value.

[1510] No default value. An exception will be thrown for non-integer or negative integer values.

[1511] Note: The sum of the initialTimeout and babbleTimeout values should be smaller than or equal to the global maxTimeout attribute or the Reco attribute maxTimeout (see below) if it is set.

[1512] Note: The babbleTimeout attribute mirrors the babbleTimeout attribute on the SALT Reco object.

[1513] MaxTimeout

[1514] Optional. Used in both multimodal and voice-only modes. The period of time in milliseconds between recognition start and results returned to the browser. If exceeded, an OnError event is thrown by the browser; this provides for network or recognizer failure in distributed environments. For Recos in “multiple” mode, as with babbleTimeout, the period is restarted after the return of each recognition or other event. No default value. An exception will be thrown for non-integer or negative integer values.

[1515] Note: maxTimeout should be greater than or equal to the sum of initialTimeout and babbleTimeout. If specified, the value of this attribute overrides the value of maxTimeout set in the Web.config file. No default value.

[1516] Note: The maxTimeout attribute mirrors the maxTimeout attribute on the SALT Reco object.
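The timeout rules above can be checked with a short helper. This is a sketch, not the control's own validation: the function name checkRecoTimeouts and the plain object passed to it are hypothetical, and treating the sum rule as a boolean result (rather than an exception) reflects that the text states it as a recommendation.

```javascript
// Validates the three timeout values of a reco description.
// Throws for non-integer or negative values; returns whether
// initialTimeout + babbleTimeout fits within the applicable maxTimeout.
function checkRecoTimeouts(reco, globalMaxTimeout) {
  var names = ["initialTimeout", "babbleTimeout", "maxTimeout"];
  for (var i = 0; i < names.length; i++) {
    var value = reco[names[i]];
    if (value !== undefined && (!Number.isInteger(value) || value < 0)) {
      throw new RangeError(names[i] + " must be a non-negative integer");
    }
  }
  // The Reco's own maxTimeout, if set, takes the place of the global one.
  var limit = reco.maxTimeout !== undefined ? reco.maxTimeout : globalMaxTimeout;
  if (limit === undefined ||
      reco.initialTimeout === undefined ||
      reco.babbleTimeout === undefined) {
    return true;  // nothing to compare against
  }
  return reco.initialTimeout + reco.babbleTimeout <= limit;
}
```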

[1517] EndSilence

[1518] Optional. Used in both multimodal and voice-only modes. For Reco objects in “automatic” mode, the period of silence, in milliseconds, that must follow the end of an utterance before the recognition results are returned. Ignored for Recos of modes other than “automatic”. If not specified, defaults to platform internal value. An exception will be thrown for non-integer or negative integer values.

[1519] Reject

[1520] Optional. Used in both multimodal and voice-only modes. Specifies the rejection threshold, below which the platform will throw the noReco event. If not specified, the speech platform will use an internal default value. Legal values are 0-1 and are platform specific. An exception will be thrown for out of range reject values. Default is 0.

[1521] Lang

[1522] Optional. Used in both multimodal and voice-only modes. Specifies the language of the speech recognition engine. The value of this attribute follows the RFC xml:lang definition. Example: lang=“en-us” denotes US English. No default value. This overrides the global setting in the Web.config file. The lang attribute mirrors the lang attribute on the SALT Reco object.

[1523] Mode

[1524] Optional. Used in both multimodal and voice-only modes. Specifies the recognition mode to be followed. Default is “automatic”. Legal values are “automatic”, “single”, and “multiple”.

[1525] Mode=“automatic”

[1526] Used for recognitions in telephony scenarios. The speech platform itself (not the application) is in control of when to stop the recognition process. Mode=“automatic” is the only mode setting that works in voice-only; other modes will be ignored and “automatic” will be used.

[1527] Mode=“single”

[1528] Used for multimodal (tap-to-talk) scenarios. The return of a recognition result is under the control of an explicit call to stop the recognition process by the application. However, exceeding babbleTimeout or maxTimeout will stop recognition. Mode=“single” is ignored for voice-only.

[1529] Mode=“multiple”

[1530] Used for “open-microphone” or dictation scenarios. Recognition results are returned at intervals until the application makes an explicit call to stop the recognition process (or babbleTimeout or maxTimeout periods are exceeded). Multiple mode recos are not supported in voice-only mode dialogs. If the browser is a voice-only browser and reco mode is set to “multiple”, an exception will be thrown at render time. Mode=“multiple” is ignored for voice-only.

[1531] GrammarSelectFunction

[1532] Optional. Used in both multimodal and voice-only modes. Specifies a client-side script that will be called prior to starting the recognition process. The script is written by the dialog author and may be used to select or modify the Grammar objects associated with the Reco object. The script may also be used to adjust speech recognition features or confidence/rejection thresholds. The GrammarSelectFunction function does not return values. The signature for GrammarSelectFunction is as follows:

[1533] function GrammarSelectFunction(object recoObj, string lastCommandOrException, int Count, object SemanticItemList)

[1534] where:

[1535] recoObj is the Reco object about to start.

[1536] lastCommandOrException is a Command type (e.g., “Help”) or a Reco event (e.g., “Silence” or “NoReco”). For multimodal dialogs, lastCommandOrException will be an empty string. Count is the number of times the QA containing the Reco object has been activated consecutively. Count starts at 1 and has no limit. For multimodal dialogs, count will be zero.

[1537] SemanticItemList: for voice-only mode, SemanticItemList is an associative array that maps semantic item id to semantic item objects. For multimodal dialogs, SemanticItemList will be null.
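A GrammarSelectFunction that adjusts the rejection threshold, as the description above allows, might look like the following sketch. The function name RelaxRejectOnRetry is hypothetical, and the assumption that the reco object exposes a writable reject property (mirroring the Reco reject attribute) is made only for this example, as is the step size of 0.1.

```javascript
// Hypothetical GrammarSelectFunction: after a failed recognition on a
// repeated activation, lower the rejection threshold slightly.
// Assumes recoObj exposes a writable "reject" property in the range 0-1.
function RelaxRejectOnRetry(recoObj, lastCommandOrException, count, semanticItemList) {
  if (lastCommandOrException === "NoReco" && count > 1) {
    // Never go below 0, the lowest legal value for reject.
    recoObj.reject = Math.max(0, recoObj.reject - 0.1);
  }
  // Per the signature above, the function returns no value.
}
```

In a multimodal dialog this function would leave the threshold unchanged, since lastCommandOrException is an empty string and count is zero there.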

[1538] OnClientSpeechDetected

[1539] Optional. Used in both multimodal and voice-only modes. Specifies a client-side script function that will be called when the onspeechdetected event is fired by the speech recognition platform on the detection of speech. Determining the actual time of firing is left to the platform (which may be configured on certain platforms using the <param> element). This may be anywhere between simple energy detection (early) and complete phrase or semantic value recognition (late). This event also triggers onbargein on a prompt which is in play and may disable the initial timeout of a started dtmf object. This function can be used in multimodal scenarios, for example, to generate a graphical indication that recognition is occurring, or in voice-only scenarios to enable fine control over other processes underway during recognition. The function does not return any values. The signature for OnClientSpeechDetected is as follows:

[1540] function OnClientSpeechDetected( )

[1541] If a Dtmf object is active when the OnClientSpeechDetected function is called, the timeouts of the Dtmf object will be disabled.

[1542] OnClientSilence

[1543] Optional. Used in both multimodal and voice-only modes. Specifies a client-side script that will be called after detecting silence (in response to SALT reco onSilence event). The function does not return any values. The signature for OnClientSilence is as follows:

[1544] function OnClientSilence(int status)

[1545] where status is the code returned in the event object.

[1546] If a Dtmf object is active when the OnClientSilence function is called, the Dtmf object will be stopped.

[1547] OnClientNoReco

[1548] Optional. Used in both multimodal and voice-only modes. Specifies a client-side script that will be called after detecting no recognition (in response to SALT reco onNoReco event). The function does not return any values. The signature for OnClientNoReco is as follows:

[1549] function OnClientNoReco(int status)

[1550] where status is the code returned in the event object.

[1551] If a Dtmf object is active when the OnClientNoReco function is called, the Dtmf object will be stopped.

[1552] OnClientError

[1553] Optional. Used in both multimodal and voice-only modes. Specifies a client-side function which is called in response to an error event in the client. Error events are generated from the event object. The function returns a boolean value. The RunSpeech algorithm will continue executing if an OnClientError script returns true. The RunSpeech algorithm will navigate to the default error page specified in the Web.config file if an OnClientError script returns false or if an error occurs and the OnClientError function is not specified. When navigating to the error page, both status and description will be passed in the query string. For example, if the error page is http://myErrorPage, RunSpeech will navigate to http://myErrorPage?status=X&description=Y (where X is the status code associated with the error and Y is the description of that error given in the Speech Tags Specification). The signature for OnClientError is as follows:

[1554] bool OnClientError(int status)

[1555] where status is the code returned in the event object.

[1556] Note: the return value of OnClientError is ignored in multimodal mode.

[1557] If a Dtmf object is active when the OnClientError function is called, the Dtmf object will be stopped.
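The voice-only error handling described above can be sketched as follows. This is a minimal illustration, not the RunSpeech implementation: the helper names, the example status code, and the choice to URL-encode the description are all assumptions.

```javascript
// Builds the error-page address in the form given in the text:
//   <errorPage>?status=X&description=Y
// Assumption: the values are URL-encoded; the text does not say.
function buildErrorPageUrl(errorPage, status, description) {
  return errorPage + "?status=" + encodeURIComponent(String(status)) +
      "&description=" + encodeURIComponent(description);
}

// Hypothetical author handler with the documented signature
// bool OnClientError(int status). The status value -1 is invented.
function myOnClientError(status) {
  var recoverable = (status === -1);
  return recoverable; // true: continue RunSpeech; false: go to the error page
}

// Returns null when RunSpeech should continue, otherwise the URL to
// navigate to. A missing handler is treated like a handler returning false.
function handleClientError(onClientError, status, description, errorPage) {
  var keepGoing = typeof onClientError === "function" &&
      onClientError(status) === true;
  return keepGoing ? null : buildErrorPageUrl(errorPage, status, description);
}
```

In multimodal mode the handler's return value is ignored, so this decision logic would apply to voice-only dialogs only.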

[1558] StyleReference

[1559] Optional. Used in both multimodal and voice-only modes. Specifies the name of a Style object. At render time, the Reco object will search for the named Style control and will use any property values specified on the Style as default values for its own properties. Property values set explicitly on the Reco object will override those set on the referenced Style.
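The precedence rule for StyleReference can be sketched as a simple merge in which Style values act as defaults. The function name and the use of plain objects are assumptions for illustration; the real controls resolve these values on the server at render time.

```javascript
// Returns the effective property values for a control. Values from the
// referenced Style are copied first; values set explicitly on the control
// then overwrite them. Both arguments are plain objects (an assumption).
function resolveWithStyle(styleProps, explicitProps) {
  var result = {};
  var name;
  for (name in styleProps) {
    if (Object.prototype.hasOwnProperty.call(styleProps, name)) {
      result[name] = styleProps[name];
    }
  }
  for (name in explicitProps) {
    if (Object.prototype.hasOwnProperty.call(explicitProps, name) &&
        explicitProps[name] !== undefined) {
      result[name] = explicitProps[name];
    }
  }
  return result;
}
```

For example, a Style that sets babbleTimeout and maxTimeout combined with a Reco that sets only maxTimeout yields the Style's babbleTimeout and the Reco's maxTimeout.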

[1560] Grammars

[1561] Optional. An array of grammar objects as specified below. An exception will be thrown if a Grammars collection contains a non-grammar object.

[1562] Params

[1563] Optional. Used in both multimodal and voice-only modes. A collection of param objects that specify additional, non-standard configuration parameter values to the speech platform. The exact nature of the configurative parameters will differ according to the proprietary platform used. Values of parameters may be specified in an XML namespace, in order to allow complex or structured values. An exception will be thrown if the Params collection contains a non-param object.

[1564] For example, the following syntax could be used to specify the location of a remote speech recognition server for distributed architectures:

<Params>
<speech:param name=“recoServer” runat=“server”>//myplatform/recoServer</speech:param>
</Params>

[1565] Record

[1566] Optional. Used in both multimodal and voice-only modes. The record object is used for recording audio input from the user. Recording may be used in addition to recognition or in place of it, according to the abilities of the platform and its profile. Only one record object is permitted in a single <reco>.

[1567] 17 Grammar Object

[1568] The grammar object contains information on the selection and content of grammars, and the means for processing recognition results. All the properties defined are read/write properties.

class Grammar : Control
{
    string id{get; set;};
    string type{get; set;};
    string lang{get; set;};
    string src{get; set;};
    string InlineGrammar{get; set;};
    string StyleReference{get; set;};
}

[1569] 17.1 Grammar Properties

[1570] Grammar is rendered for both multimodal and voice-only modes. All properties are available at design time and run time.

[1571] Type

[1572] Optional. Used in both multimodal and voice-only modes. The mime-type corresponding to the grammar format used. No default value. The type attribute mirrors the type attribute on the SALT Grammar object.

[1573] Lang

[1574] Optional. Used in both multimodal and voice-only modes. String indicating which language the grammar refers to. The value of this attribute follows the RFC xml:lang definition. Example: lang=“en-us” denotes US English. No default value. Overrides the global value set in the Web.config file. The lang attribute mirrors the lang attribute on the SALT Grammar object.

[1575] Src

[1576] Optional. Used in both multimodal and voice-only modes. Specifies the URI of the grammar to load. If an inline grammar and src are both specified, the inline grammar takes precedence and src is ignored. The src attribute mirrors the src attribute on the SALT Grammar object. An exception will be thrown if neither src nor InlineGrammar is specified.

[1577] InlineGrammar

[1578] Optional. Used in both multimodal and voice-only modes. InlineGrammar accesses the text of the grammar specified inline. If InlineGrammar and src are both specified, InlineGrammar takes precedence and src is ignored. An exception will be thrown if neither src nor InlineGrammar is specified.

[1579] Inline grammars must be HTML encoded; they are HTML encoded when sent down to the server. Authors must use &gt; for > and &lt; for < and adhere to all other HTML encoding standards. It is recommended that authors use the property builder in DET, which will handle the HTML encoding automatically.
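The encoding requirement and the src/InlineGrammar precedence rule can be sketched as below. Both helpers are hypothetical and are not part of the controls; the second one reads the exception rule as applying when neither source is given.

```javascript
// Minimal HTML encoder for inline grammar text. The ampersand is replaced
// first so that entities produced by the later replacements are not
// encoded a second time.
function htmlEncode(text) {
  return String(text)
      .replace(/&/g, "&amp;")
      .replace(/</g, "&lt;")
      .replace(/>/g, "&gt;")
      .replace(/"/g, "&quot;");
}

// Picks the grammar source: an inline grammar wins over src, and an
// error is raised when neither is present.
function grammarSource(grammar) {
  if (grammar.InlineGrammar) {
    return { kind: "inline", value: grammar.InlineGrammar };
  }
  if (grammar.src) {
    return { kind: "src", value: grammar.src };
  }
  throw new Error("Either src or InlineGrammar must be specified");
}
```

An author who writes the grammar by hand would pass the raw grammar markup through an encoder of this kind before assigning it to InlineGrammar.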

[1580] StyleReference

[1581] Optional. Used in both multimodal and voice-only modes. Specifies the name of a Style object. At render time, the Grammar object will search for the named Style control and will use any property values specified on the Style as default values for its own properties. Property values set explicitly on the Grammar object will override those set on the referenced Style.

[1582] 18 Dtmf Object

[1583] Dtmf may be used by QA controls in telephony applications. The Dtmf object essentially applies a different modality of grammar (a keypad input grammar rather than a speech input grammar) to the same question.

class Dtmf : Control
{
    string id{get; set;};
    bool preflush{get; set;};
    int initialTimeOut{get; set;};
    int interDigitTimeOut{get; set;};
    int endSilence{get; set;};
    string OnClientSilence{get; set;};
    string OnClientKeyPress{get; set;};
    string OnClientError{get; set;};
    string StyleReference{get; set;};
    ParamCollection Params{get; set;};
    GrammarCollection Grammars{get; set;};
}

[1584] 18.1 Dtmf Properties

[1585] All properties are available at design time.

[1586] Preflush

[1587] Optional. Flag to indicate whether to automatically flush the DTMF buffer on the underlying telephony interface card before activation. Default is “false” (to enable type-ahead functionality). The preflush attribute mirrors the preflush attribute on the SALT DTMF object.

[1588] InitialTimeOut

[1589] Optional. The number of milliseconds to wait for receiving the first key press before raising a timeout event. If this timeout occurs, the DTMF collection ends automatically. If unspecified, initialTimeout defaults to a telephony platform internal setting. An exception is thrown if initialTimeout is a negative value. The initialTimeout attribute mirrors the initialTimeout attribute on the SALT DTMF object.

[1590] InterdigitTimeOut

[1591] Optional. The timeout period in milliseconds for adjacent DTMF presses before raising a timeout event. If this timeout occurs, the DTMF collection ends automatically. If unspecified, interdigitTimeout defaults to a telephony platform internal setting. An exception is thrown if interdigitTimeout is a negative value. The interdigitTimeout attribute mirrors the interdigitTimeout attribute on the SALT DTMF object.

[1592] EndSilence

[1593] Optional. The timeout period in milliseconds when input matches a complete path through the grammar but further input is still possible. This timeout specifies the period of time in which further input is permitted after the complete match. Once exceeded, onreco is thrown. (For a complete grammar match where further input is not possible, the endsilence period is not required, and onreco is thrown immediately.) If this attribute is not supported directly by a platform, or unspecified in the application, the value of endsilence defaults to that used for interdigittimeout. An exception is thrown if endSilence is a negative value.
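The defaulting and validation rules for the three Dtmf timeouts can be sketched as follows. The function and the `platformDefaults` argument are hypothetical; `platformDefaults` stands in for the telephony platform's internal settings, which this specification does not enumerate.

```javascript
// Resolves the effective Dtmf timeouts (all in milliseconds).
// settings: values set by the author; any of them may be undefined.
// platformDefaults: hypothetical stand-in for platform internal settings.
function resolveDtmfTimeouts(settings, platformDefaults) {
  var names = ["initialTimeout", "interdigitTimeout", "endSilence"];
  names.forEach(function (name) {
    if (settings[name] !== undefined && settings[name] < 0) {
      throw new RangeError(name + " must not be negative");
    }
  });
  var initial = settings.initialTimeout !== undefined
      ? settings.initialTimeout : platformDefaults.initialTimeout;
  var interdigit = settings.interdigitTimeout !== undefined
      ? settings.interdigitTimeout : platformDefaults.interdigitTimeout;
  // endSilence falls back to the value used for interdigitTimeout.
  var endSilence = settings.endSilence !== undefined
      ? settings.endSilence : interdigit;
  return {
    initialTimeout: initial,
    interdigitTimeout: interdigit,
    endSilence: endSilence
  };
}
```

Resolving endSilence last keeps its fallback tied to the effective interdigitTimeout, whether that value came from the author or from the platform.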

[1594] OnClientSilence

[1595] Optional. Specifies a client-side script function to be called if there is no DTMF key press before initialTimeout expires. The platform halts DTMF collection automatically. The QA treats this as a silence. The function returns no values. The signature for OnClientSilence is as follows:

[1596] function OnClientSilence( )

[1597] If a Reco object is active when the OnClientSilence function is called, the Reco object will be stopped.

[1598] OnClientKeyPress

[1599] Optional. Specifies a client-side script function that is called on every pressing of a DTMF key which is legal according to the input grammar. If a prompt is in playback, the onkeypress event will trigger the onbargein event on the prompt (and cease its playback if the prompt's bargein attribute is set to true). If a Reco object is active, the first onkeypress event will disable the timeouts of the Reco object.

[1600] OnClientError

[1601] Optional. Specifies a client-side function which is called in response to a serious or fatal error with the DTMF collection/recognition process. Error events are generated from the event object. The function returns a boolean value. The RunSpeech algorithm will continue executing if an OnClientError script returns true. The RunSpeech algorithm will navigate to the default error page specified in the Web.config file if an OnClientError script returns false or if an error occurs and the OnClientError function is not specified. When navigating to the error page, both status and description will be passed in the query string. For example, if the error page is http://myErrorPage, RunSpeech will navigate to http://myErrorPage?status=X&description=Y (where X is the status code associated with the error and Y is the description of that error given in the Speech Tags Specification). The signature for OnClientError is as follows:

[1602] bool OnClientError(int status)

[1603] where status is the code returned in the event object.

[1604] If a Reco object is active when the OnClientError function is called, the Reco object will be stopped.

[1605] OnClientNoReco

[1606] Optional. Specifies a client-side function which is called in response to a failure to recognize by the DTMF collection/recognition process. This is most likely to occur when the input detected does not match a path through the active grammars. The function does not need to return a value. The prototype for the function is:

[1607] OnClientNoReco(int status)

[1608] where status is the code returned in the event object.

[1609] StyleReference

[1610] Optional. Used in both multimodal and voice-only modes. Specifies the name of a Style object. At render time, the Dtmf object will search for the named Style control and will use any property values specified on the Style as default values for its own properties. Property values set explicitly on the Dtmf object will override those set on the referenced Style.

[1611] Grammars

[1612] Optional. An array of grammar objects.

[1613] Params

[1614] A collection of param objects that specify additional, non-standard configuration parameter values to the speech platform. The exact nature of the configurative parameters will differ according to the proprietary platform used. Values of parameters may be specified in an XML namespace, in order to allow complex or structured values. An exception will be thrown if the Params collection contains a non-param object.

[1615] For example, the following syntax shows how to specify a parameter on a particular DTMF platform.

[1616] <Params>

[1617] <speech:param name=“myDTMFParam” runat=“server”>myDTMFValue</speech:param>

[1618] </Params>

[1619] 19 Param Object

[1620] The param object allows authors to specify the names and values of additional, non-standard configuration parameters to the speech platform. The exact nature of the configurative parameters will differ according to the proprietary platform used. Values of parameters may be specified in an XML namespace, in order to allow complex or structured values.

class param : Control
{
    string name{get; set;};
    string Value{get; set;};
}

[1621] Note that the value of a param object is specified between the param tags.

[1622] 19.1 Param Properties

[1623] Name

[1624] Required. The name of the parameter to be configured. An exception will be thrown for <param> elements that do not contain the name attribute.

[1625] Value

[1626] Optional. The value which will be assigned to the named parameter.

[1627] 20 Record Object

[1628] The record object is used to record audio input from the user. Recording may be used in addition to recognition or in place of it, according to the abilities of the platform and its profile.

class record : Control
{
    bool enabled{get; set;};
    string type{get; set;};
    bool beep{get; set;};
}

[1629] 20.1 Record Properties

[1630] Enabled

[1631] Optional. Flag to indicate whether or not to record the user input. Defaults to “false”.

[1632] Type

[1633] Optional. MIME type of the recording. MIME types can be specified such as “audio/wav” for WAV (RIFF header) 8 kHz 8-bit mono mu-law [PCM] single channel or “audio/basic” for Raw (headerless) 8 kHz 8-bit mono mu-law [PCM] single channel. If unspecified, defaults to G.711 wave file.

[1634] Beep

[1635] Optional. Boolean value. If true, the platform will play a beep before recording begins. Defaults to false.

[1636] 21 Call Control

[1637] All call-related server-side controls deal with a single device and a single active call at any given time. If the dialog author needs to monitor more than one device or handle more than one active call, the custom SmexMessage can be used and the author will have to handle CSTA messages.

[1638] All call control controls are only used in voice-only mode.

[1639] The SpeechControls.dll will implement a support class (CallInfo), a base class (SmexMessageBase), and the following WebControls:

[1640] SmexMessage

[1641] for custom/advanced CSTA messages, and messages to any non-CSTA <smex> elements by specifying a client side <smex> element

[1642] TransferCall

[1643] for CSTA SingleStepTransfer service

[1644] MakeCall

[1645] for CSTA MakeCall service

[1646] DisconnectCall

[1647] for CSTA ClearConnection service

[1648] AnswerCall

[1649] for CSTA AnswerCall service

[1650] 21.1 Common Classes

[1651] 21.1.1 CallInfo

class CallInfo
{
    string MonitorCrossRefId {get;};
    string DeviceId {get;};
    string CallId {get;};
    string CallingDevice {get;};
    string CalledDevice {get;};
}

[1652] 21.1.1.1 CallInfo Properties

[1653] MonitorCrossRefId: The id returned by the start page's MonitorStart.

[1654] DeviceId: The device id for the current active call.

[1655] CallId: The call id for the current active call. These properties can be used in the custom SmexMessage object to form the correct CSTA xml message on the web server side.

[1656] CallingDevice: This represents the calling device information provided by the network (ANI, for example). This information will always remain with the call and will never change (unlike the callingDevice).

[1657] CalledDevice: This represents the called device information provided by the network (DNIS, for example). This information will always remain with the call and will never change (unlike the calledDevice).

[1658] 21.1.2 SmexMessageBase

[1659] This is an internal class. Authors that need to create new call-control controls should derive from SmexMessage.

[1660]

internal abstract class SmexMessageBase
{
    string ID {get; set};
    int Timer {get; set};
    bool AutoPostback {get; set};
    string ClientActivationFunction {get; set};
    string OnClientError {get; set};
    string OnClientTimeout {get; set};
    CallInfo CurrentCall {get;};
}

[1661] 21.1.2.1 SmexMessageBase Properties

[1662] ID: ASP.NET control ids.

[1663] SpeechIndex: Same as for other speech controls. This index controls the order of the object within RunSpeech. Default 0, meaning source order after all non-zero indexed speech objects.

[1664] Timer: Number in milliseconds indicating the time span before a timeout event will be triggered. This is set on the client side <smex> object before the CSTA message is sent. The default is 0, meaning no timeout. An exception will be thrown for negative values of Timer.

[1665] AutoPostback: Whether to cause a postback when the object's event is fired. Default is false.

[1666] ClientActivationFunction: The client side function called by RunSpeech to determine whether an object is active. When not specified, the object is considered active only once (the PlayOnce behavior). ClientActivationFunction returns a bool to indicate whether the associated object should be active (true) or not (false). The signature for ClientActivationFunction is:

[1667] function ClientActivationFunction(object sender)

[1668] where sender is the current object
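A minimal sketch of the activation rule for call-control objects follows. The property names `clientActivationFunction` and `hasRun` are assumptions for illustration; in particular, `hasRun` is a hypothetical flag used here to model the "active only once" default.

```javascript
// Decides whether a call-control object is active:
// - with a ClientActivationFunction, its boolean result decides;
// - without one, the object is active only until it has run once
//   (the PlayOnce behavior).
function isCallControlActive(obj) {
  if (typeof obj.clientActivationFunction === "function") {
    return obj.clientActivationFunction(obj) === true;
  }
  return !obj.hasRun;
}

// Hypothetical author function with the documented signature
// ClientActivationFunction(sender): activate only while a transfer
// target has been chosen.
function activateWhenTargetKnown(sender) {
  return typeof sender.transferredTo === "string" && sender.transferredTo !== "";
}
```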

[1669] OnClientError: Optional. Default is false when not specified. The client side function called when <smex> fires the onerror event. OnClientError returns a bool: true to continue RunSpeech and false to go to the error page. The signature for OnClientError is:

[1670] function OnClientError(object sender, int status)

[1671] where

[1672] sender is the current object, and

[1673] status is the value of the object's status property.

[1674] OnClientTimeout: Optional. Default is true when not specified. The client side function called when <smex> fires the ontimeout event. OnClientTimeout returns a bool: true to continue RunSpeech and false to go to the error page. The signature for OnClientTimeout is:

[1675] function OnClientTimeout(object sender)

[1676] where

[1677] sender is the current object.

[1678] CurrentCall: Returns the current active call object.

[1679] 21.2 Server-Side Classes

[1680] 21.2.1 SmexMessage

[1681] This is a generic class for sending raw CSTA messages and receiving CSTA events.

[1682] Since the number and types of events generated by this message are unknown, the author needs to be careful about when RunSpeech can continue.

[1683] RunSpeech will be paused just before calling the author's OnClientBeforeSend function when the message is about to be sent.

[1684] If OnClientReceive is not specified, RunSpeech will resume when any smex event is received after the message is sent.

[1685] If OnClientReceive is specified, the author returns true to indicate RunSpeech can resume after receiving the expected event.

[1686] RunSpeech will resume after an Error or Timeout occurs.

[1687] The Smex Timer will be set to the given value before the message is sent and back to zero right before RunSpeech resumes.

[1688] When an unexpected smex event arrives, i.e. when the current active object in RunSpeech is not a call related object, the smex event is ignored.

[1689] When AutoPostback is set to true, all events will execute the client handler, then cause a post-back to the web server where the corresponding server event will be fired.

class SmexMessage : SmexMessageBase
{
    string Message {get; set};
    string ClientSmexId {get; set};
    string OnClientBeforeSend {get; set};
    string OnClientReceive {get; set};
    event Receive;
}

[1690] 21.2.1.1 SmexMessage Properties

[1691] Message: Required. The CSTA XML message to be sent. An exception will be thrown if Message is not specified.

[1692] OnClientBeforeSend: Optional. Client side function called just before the message is sent. This is to give the author a last chance to modify the message. OnClientBeforeSend returns a string containing the new message. If null is returned, the original message will be sent. The signature for OnClientBeforeSend is:

function OnClientBeforeSend(object sender, string Message)

where sender is the client-side SmexMessage object, and Message is the original message.

[1693] Receive: Optional. Server side event fired when the client side <smex> object receives smex events. The signature of a ReceiveEventHandler is:

[1694] void ReceiveEventHandler(object sender, ReceiveEventArgs e)

[1695] where

[1696] sender will be the server side SmexMessage object. The second argument e is of the following type:

class ReceiveEventArgs : EventArgs
{
    string Received {get};
}

where Received contains the event message received from <smex>.

[1697] OnClientReceive: Optional. Client-side function called when the client side <smex> object receives smex events. OnClientReceive returns a bool: true means that this object has received all the events it expects and RunSpeech can continue; false means that this object expects more events before RunSpeech can continue. The signature for OnClientReceive is:

[1698] function OnClientReceive(object sender, string Message)

[1699] where

[1700] sender is the client-side SmexMessage object, and

[1701] Message is the received message.
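The interplay of OnClientBeforeSend and OnClientReceive can be sketched as two small decisions: which message text is sent, and whether RunSpeech may resume after an event. The helper names and the "Established" marker are assumptions for illustration. Treating an undefined return value like null is also an assumption.

```javascript
// Chooses the message text to send. The author's function may return a
// replacement string; null means "send the original message".
// Assumption: undefined is treated the same way as null.
function messageToSend(onClientBeforeSend, sender, message) {
  if (typeof onClientBeforeSend !== "function") {
    return message;
  }
  var replacement = onClientBeforeSend(sender, message);
  return (replacement === null || replacement === undefined)
      ? message : replacement;
}

// Decides whether RunSpeech may resume after a smex event. Without an
// OnClientReceive function, any event resumes RunSpeech.
function canResumeAfterEvent(onClientReceive, sender, received) {
  if (typeof onClientReceive !== "function") {
    return true;
  }
  return onClientReceive(sender, received) === true;
}

// Hypothetical author handler: keep waiting until an event whose text
// contains "Established" arrives. The marker string is invented.
function waitForEstablished(sender, message) {
  return message.indexOf("Established") !== -1;
}
```

Error and Timeout events resume RunSpeech regardless of these handlers, so they are not modeled here.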

[1702] ClientSmexId: Optional. This is the client side <smex> element id. If not set, messages will be sent through the default Call Manager <smex> element. If set to a non-empty string, it has to be the id of an existing SALT <smex> element, which the author has to add to the page.

[1703] 21.2.2 TransferCall

[1704] The TransferCall control transfers the current call using CSTA SingleStepTransfer service. When RunSpeech runs this object, it blocks any further speech dialog until transfer succeeds or fails.

class TransferCall : SmexMessageBase
{
    string TransferredTo {get; set};
    string OnClientFailed {get; set};
    string OnClientTransferred {get; set};
    event Transferred;
}

[1705] 21.2.2.1 TransferCall Properties

[1706] TransferredTo: Required. The device identifier associated with the transferred-to endpoint.

[1707] Transferred: Optional. Server side event fired when the call is transferred. The signature of an EventHandler is:

[1708] void EventHandler(object sender, EventArgs e);

[1709] where

[1710] sender is the server side TransferCall object, and

[1711] e is of the standard EventArgs type.

[1712] OnClientTransferred: Optional. Client side function called when the call is transferred. OnClientTransferred returns nothing. The signature of OnClientTransferred is:

[1713] function OnClientTransferred(object sender)

[1714] where:

[1715] sender is the client-side TransferCall object.

[1716] OnClientFailed: Client-side function called when CSTA returns FAILED event. OnClientFailed returns a bool: true to continue RunSpeech and false to go to the error page. The signature for OnClientFailed is:

[1717] function OnClientFailed(object sender, string cause)

[1718] where

[1719] sender is the client-side TransferCall object, and

[1720] cause is the reason for failure returned from <smex>.

[1721] 21.2.3 MakeCall

[1722] The MakeCall control makes an outbound call to the given number on the given device when RunSpeech runs this object. Further speech dialog is blocked until the call is connected or fails to connect.

class MakeCall : SmexMessageBase
{
    string CallingDevice {get; set};
    string CalledDirectoryNumber {get; set};
    string OnClientFailed {get; set};
    string OnClientConnected {get; set};
    event Connected;
}

[1723] 21.2.3.1 MakeCall Properties

[1724] CallingDevice: Required. Default is the internal CallInfo DeviceId. The control will use this device to place the outbound call.

[1725] CalledDirectoryNumber: Required. Phone number to dial. An exception will be thrown if CalledDirectoryNumber is not specified.

[1726] Connected: Server side event when the call is connected. The signature of an EventHandler is:

[1727] void EventHandler(object sender, EventArgs e)

[1728] where

[1729] sender is the server side MakeCall object, and

[1730] e is of the standard EventArgs type.

[1731] At this point, the CurrentCall property should contain the information about the call in progress.

[1732] OnClientConnected: Client side function called when the call is connected. OnClientConnected returns nothing. The signature for OnClientConnected is:

[1733] function OnClientConnected(object sender, string CalledDirectoryNumber)

[1734] where:

[1735] sender is the client-side MakeCall object, and

[1736] CalledDirectoryNumber is the property of the MakeCall object.

[1737] OnClientFailed: Client side function called when CSTA returns FAILED event. OnClientFailed returns a bool: true to continue RunSpeech and false to go to the error page. The signature for OnClientFailed is:

[1738] function OnClientFailed(object sender, string cause)

[1739] where

[1740] sender is the client-side MakeCall object, and

[1741] cause is the reason for failure returned from <smex>.

[1742] 21.2.4 DisconnectCall

class DisconnectCall : SmexMessageBase
{
    string OnClientFailed {get; set};
    string OnClientDisconnected {get; set};
    event Disconnected;
}

[1743] 21.2.4.1 DisconnectCall Properties

[1744] Disconnected: Optional. Server side event when the call is disconnected. The signature of an EventHandler is:

[1745] void EventHandler(object sender, EventArgs e)

[1746] where:

[1747] sender is the server side DisconnectCall object, and

[1748] e is of the standard EventArgs type.

[1749] OnClientDisconnected: Optional. Client side function called when the call is disconnected. OnClientDisconnected returns nothing. The signature for OnClientDisconnected is:

[1750] function OnClientDisconnected(object sender)

[1751] where sender is the client-side DisconnectCall object.

[1752] OnClientFailed: Optional. Client side function called when CSTA returns FAILED event. OnClientFailed returns a bool: true to continue RunSpeech and false to go to the error page. The signature for OnClientFailed is:

[1753] function OnClientFailed(object sender, string cause)

[1754] where

[1755] sender is the client-side DisconnectCall object, and

[1756] cause is the reason for failure returned from <smex>.

[1757] 21.2.5 AnswerCall

[1758] The AnswerCall control answers incoming calls on the given device. When activated, this object will block RunSpeech until an incoming call is answered.

[1759] Server-Side Class:

class AnswerCall : SmexMessageBase
{
    string OnClientConnected {get; set};
    string OnClientFailed {get; set};
    event Connected;
}

[1760] 21.2.5.1 AnswerCall Properties

[1761] Connected: Optional. Server side event when the call is connected. The signature of an EventHandler is:

[1762] void EventHandler(object sender, EventArgs e)

[1763] where:

[1764] sender is the server side AnswerCall object, and

[1765] e is of the standard EventArgs type.

[1766] At this point, the CurrentCall property should contain information about the call in progress.

[1767] OnClientConnected: Optional. Client side function called when the call is connected. OnClientConnected returns nothing. The signature for OnClientConnected is:

[1768] function OnClientConnected(object sender, string callid, string CallingDevice, string CalledDevice)

[1769] where:

[1770] sender is the client side AnswerCall object

[1771] callid is the id of the current call

[1772] CallingDevice is the caller's network device id

[1773] CalledDevice is the recipient's network device id.

[1774] OnClientFailed: Optional. Client side function called when CSTA returns FAILED event. OnClientFailed returns a bool: true to continue RunSpeech and false to go to the error page. The signature of OnClientFailed is:

[1775] function OnClientFailed(object sender, string cause)

[1776] where

[1777] sender is the client-side AnswerCall object, and

[1778] cause is the reason for failure returned from <smex>.

[1779] 22 RunSpeech

[1780] 22.1 Dialog Processing Algorithm

[1781] The RunSpeech algorithm is used to drive dialog flow on a voice-only client. This involves system prompting and dialog management and processing of speech input. It is specified as a script file referenced by URI from every relevant speech-enabled page (equivalent to inline embedded script).

[1782] Important: the RunSpeech script will be completely exposed to the public. Since it will be hosted on the application web site, authors of dialogs will be at liberty to examine, edit, replace or ignore the RunSpeech script code.

[1783] Rendering of the page for voice-only browsers is done in the following manner:

[1784] The RunSpeech function works as follows (RunSpeech is called in response to document.onreadystate becoming “complete”):

[1785] Controls considered for activation are the QA, CompareValidator and CustomValidator controls.

[1786] 1. Find the first active QA or Validator control in speech index order (determining whether a QA/Validator is active is explained below).

[1787] 2. If there is no active control, submit the page.

[1788] 3. Otherwise, run the control.

[1789] A QA is considered active if and only if:

[1790] 1. The QA's clientActivationFunction either is not present or returns true, AND

[1791] 2. If the Answers collection is non-empty, the State of at least one of the SemanticItems pointed to by the set of Answers is Empty, OR

[1792] 3. If the Answers collection is empty, the State of at least one SemanticItem in the Confirm array is NeedsConfirmation.

[1793] However, if the QA has PlayOnce true and its Prompt has been run successfully (reached OnComplete), the QA will not be a candidate for activation.
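The activation test for a QA, together with the "find the first active control, otherwise submit" step, can be sketched as below. This is a literal reading of the rules above and not the RunSpeech script itself. The property names (`playOnce`, `promptCompleted`, `answers`, `confirms`, `state`) are hypothetical. The activation function is called without arguments, which is also an assumption, and the controls are assumed to be already sorted in speech index order.

```javascript
// Returns true when a QA is a candidate for activation.
function isQaActive(qa) {
  // PlayOnce QAs whose prompt already completed are never candidates.
  if (qa.playOnce && qa.promptCompleted) {
    return false;
  }
  // Rule 1: an activation function, when present, must return true.
  if (typeof qa.clientActivationFunction === "function" &&
      qa.clientActivationFunction() !== true) {
    return false;
  }
  // Rule 2: with Answers, at least one target SemanticItem must be Empty.
  if (qa.answers.length > 0) {
    return qa.answers.some(function (item) {
      return item.state === "Empty";
    });
  }
  // Rule 3: without Answers, at least one Confirm must need confirmation.
  return qa.confirms.some(function (item) {
    return item.state === "NeedsConfirmation";
  });
}

// One pass over controls that are already in speech index order:
// run the first active control, or submit the page if none is active.
function nextStep(controls, isActive) {
  for (var i = 0; i < controls.length; i++) {
    if (isActive(controls[i])) {
      return { action: "run", control: controls[i] };
    }
  }
  return { action: "submit" };
}
```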

[1794] A QA is run as follows:

[1795] 1. If this is a different control than the previous active control, reset the prompt Count value.

[1796] 2. Increment the prompt Count value.

[1797] 3. If PromptSelectFunction is specified, call the function and set the Prompt's inlinePrompt to the returned string.

[1798] 4. If a Reco object is present, start it. This Reco should already include any active command grammar.

[1799] 5. Start the Dtmf object if present. (The same concerns apply with regard to command Dtmf grammars.)

[1800] A Validator (either a CompareValidator or a CustomValidator) is active if:

[1801] 1. The SemanticItemToValidate has not been validated by this validator.

[1802] A CompareValidator is run as follows:

[1803] 1. Compare the values of the ElementToCompare or ValueToCompare and the SemanticItemToValidate according to the validator's Operator.

[1804] 2. If the test returns false, empty the text field of the SemanticItemToValidate (or both if the InvalidateBoth flag is set) and play the prompt.

[1805] 3. If the test returns true, mark the SemanticItemToValidate as validated by this validator.
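The CompareValidator steps above can be sketched as follows. The object shapes, the `validatedBy` bookkeeping array, and the operator names are assumptions made for this example; the specification does not list the operator values in this section.

```javascript
// Hypothetical operator table; the names are assumed for illustration.
var OPERATORS = {
  Equal: function (a, b) { return a === b; },
  NotEqual: function (a, b) { return a !== b; },
  LessThan: function (a, b) { return a < b; },
  GreaterThan: function (a, b) { return a > b; }
};

// A validator is active while it has not yet validated its item.
function isValidatorActive(validator) {
  return validator.item.validatedBy.indexOf(validator.id) === -1;
}

// Runs one CompareValidator. `item` models SemanticItemToValidate and
// `itemToCompare` models ElementToCompare (optional); when it is absent,
// `valueToCompare` is used. `playPrompt` is supplied by the caller.
function runCompareValidator(validator, playPrompt) {
  var item = validator.item;
  var other = validator.itemToCompare;
  var rhs = other ? other.value : validator.valueToCompare;
  var passed = OPERATORS[validator.operator](item.value, rhs);
  if (passed) {
    item.validatedBy.push(validator.id); // step 3
  } else {
    item.value = "";                     // step 2: empty the item
    if (validator.invalidateBoth && other) {
      other.value = "";                  // InvalidateBoth clears both sides
    }
    playPrompt(validator.prompt);
  }
  return passed;
}
```

Because a failed comparison empties the semantic item, the QA that fills it becomes active again on the next RunSpeech pass, which is how the user gets asked the question again.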

[1806] A CustomValidator is run as follows:

[1807] 1. The ClientValidationFunction is called with the value of theSemanticItemToValidate.

[1808] 2. If the function returns false, the semanticItem cleared andthe prompt is played, otherwise as validated by this validator.
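The validator procedures above share the same success and failure handling, which the following sketch factors into one helper. The operator names, the `validatedBy` bookkeeping map, and the `playPrompt` callback are hypothetical; the text states only that the comparison follows the validator's Operator.

```javascript
// Hypothetical operator table.
var OPERATORS = {
  Equal: function (a, b) { return a === b; },
  NotEqual: function (a, b) { return a !== b; },
  GreaterThan: function (a, b) { return a > b; },
  LessThan: function (a, b) { return a < b; }
};

// A validator is active until it has validated its semantic item.
function isValidatorActive(v) {
  return !v.semanticItemToValidate.validatedBy[v.id];
}

// Shared success/failure handling for both validator kinds.
function finishValidation(v, passed, playPrompt) {
  var item = v.semanticItemToValidate;
  if (passed) {
    item.validatedBy[v.id] = true;           // mark as validated by this validator
    return true;
  }
  item.value = "";                           // clear the offending item
  if (v.invalidateBoth && v.semanticItemToCompare) {
    v.semanticItemToCompare.value = "";      // InvalidateBoth clears the other item too
  }
  playPrompt(v.prompt);
  return false;
}

function runCompareValidator(v, playPrompt) {
  var other = v.semanticItemToCompare ? v.semanticItemToCompare.value : v.valueToCompare;
  var passed = OPERATORS[v.operator](v.semanticItemToValidate.value, other);
  return finishValidation(v, passed, playPrompt);
}

function runCustomValidator(v, playPrompt) {
  var passed = v.clientValidationFunction(v.semanticItemToValidate.value);
  return finishValidation(v, passed, playPrompt);
}
```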

[1809] A Command is considered active if and only if:

[1810] 1. It is in Scope, AND

[1811] 2. There is not another Command of the same Type lower in the scope tree.
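One way to picture the two Command rules is the sketch below. It assumes, hypothetically, that the scope tree can be summarized as a chain of control ids running from the active QA outward to the page root, and that "lower in the scope tree" means closer to the active QA. Neither the data shape nor the function name comes from the text.

```javascript
// scopeChain lists control ids from the active QA outward to the page root,
// e.g. ["qaCity", "pnlTravel", "page"]. All names here are hypothetical.
function activeCommands(commands, scopeChain) {
  var bestByType = {};
  for (var i = 0; i < commands.length; i++) {
    var cmd = commands[i];
    var depth = scopeChain.indexOf(cmd.scope);
    if (depth < 0) continue;                       // rule 1: not in Scope
    var best = bestByType[cmd.type];
    if (!best || depth < best.depth) {             // rule 2: the lowest Command of a Type wins
      bestByType[cmd.type] = { depth: depth, command: cmd };
    }
  }
  var result = [];
  for (var type in bestByType) {
    if (Object.prototype.hasOwnProperty.call(bestByType, type)) {
      result.push(bestByType[type].command);
    }
  }
  return result;
}
```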

[1812] 22.2 LastCommandOrException

[1813] LastCommandOrException is a global variable and its value is passed to several author-defined functions as a parameter.

[1814] LastCommandOrException is a global variable maintained by RunSpeech. The value is set to the last Command.Type or recognition exception that occurred. The value will be reset to “” when there is a QA transition (the current active QA is different than the previously active QA, or is the first active QA). There is one exception to this rule: If the QA is in a Short time-out confirmation state, and the current recognition result is “Silence”, the LastCommandOrException will be set to “” (silence in Short time-out confirmation is not an exception, but a valid input.)

[1815] In this fashion, ClientActivationFunction will always get the LastCommandOrException that occurred anywhere in the page, but the rest of the functions of the active QA will only get a non-empty LastCommandOrException if they have been activated more than once in a row.

[1816] If, after processing all the Answers, ExtraAnswers and Confirms in a QA, nothing is matched (either due to a mismatch in the sml returned or to a high reject threshold), the LastCommandOrException will be set to “NoReco”.

[1817] Active Validators will also reset the global LastCommandOrException.

[1818] Possible values of LastCommandOrException are:

Platform event | LastCommandOrException
Prompt fires an onerror event | “PromptError”
Reco fires an onerror event | “RecoError”
Dtmf fires an onerror event | “DtmfError”
Reco fires an onnoreco event | “NoReco”
Reco fires a silence event | “Silence”
Command is activated | Command.type
Transition to new QA | “”

[1819] Also, a PromptSelectFunction's LastCommandOrException will have the value “ShortTimeoutConfirmation” when its QA is in Short Time-out Confirmation mode (i.e., when count==1, firstInitialTimeout is non-zero, etc.).
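The update rules for LastCommandOrException described in this section can be sketched as a small function. The event descriptor (`kind`, `commandType`, `shortTimeoutConfirmation`) and the event key strings are hypothetical labels chosen for the example; only the resulting values are taken from the text.

```javascript
// Event keys are hypothetical labels for the platform events listed in the text.
var VALUE_FOR_EVENT = {
  "prompt.onerror": "PromptError",
  "reco.onerror": "RecoError",
  "dtmf.onerror": "DtmfError",
  "reco.onnoreco": "NoReco",
  "reco.onsilence": "Silence"
};

// Returns the new LastCommandOrException value given the current one and an event.
function updateLastCommandOrException(current, event) {
  if (event.kind === "qaTransition") return "";            // reset on QA transition
  if (event.kind === "command") return event.commandType;  // an activated Command sets its Type
  if (event.kind === "reco.onsilence" && event.shortTimeoutConfirmation) {
    return "";                                             // silence is valid input here
  }
  var value = VALUE_FOR_EVENT[event.kind];
  return value === undefined ? current : value;            // unknown events leave it unchanged
}
```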

[1820] 22.3 Count

[1821] Count is exclusively local, both in ClientActivationFunction and in the rest of the functions which are passed Count. That is, these functions are always passed the Count of their own QA. To avoid confusion, ClientActivationFunction receives the value that the PromptSelectFunction would receive if this QA were active.

[1822] 22.4 Postback Support

[1823] In their simplest form, ASP.NET pages are stateless. They are instantiated, executed, rendered, and disposed of on every round trip to the server. In the visual world, ASP.NET provides the ViewState mechanism to keep track of server control state values that don't otherwise post back as part of an HTTP form. The ASP.NET framework uses ViewState to manage and restore page properties prior to and after postback.

[1824] For voice-only pages, the ASP.NET ViewState mechanism is not available to the web developer. However, a similar mechanism is provided by RunSpeech. RunSpeech maintains an object that can be used to store values which authors wish to be persisted across postbacks. The syntax is:

[1825] RunSpeech.ClientViewState[“MyVariableName”] = myVariableValue;

[1826] Any JScript built-in type can be persisted: string, number, boolean, array, object, Date, RegExp, or function. The main difference between the ASP.NET ViewState (for visual pages) and the voice-only ClientViewState mechanism is that authors of voice-only pages must manually declare and set values they wish to maintain across postbacks.

[1827] If AutoPostBack is set to true in any speech control, the matching client-side function will always be executed before posting back to the server. If the author wishes to persist any page state across postback, these client-side functions are a good place to invoke the ClientViewState object of RunSpeech.
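As an illustration, a client-side function run before postback might store state as sketched below. The `RunSpeech` object defined here is only a stand-in so the sketch runs outside a speech browser, and the handler name, its parameters, and the key names are hypothetical.

```javascript
// Stand-in for the object that RunSpeech supplies on a real voice-only page.
var RunSpeech = { ClientViewState: {} };

// Hypothetical client-side handler, e.g. one named by a control whose AutoPostBack is true.
function onCityAnswered(cityName, attempts) {
  RunSpeech.ClientViewState["LastCity"] = cityName;    // string
  RunSpeech.ClientViewState["Attempts"] = attempts;    // number
  // Arrays can be persisted too; build a new array rather than mutating in place.
  var history = RunSpeech.ClientViewState["History"] || [];
  RunSpeech.ClientViewState["History"] = history.concat([cityName]);
}
```

Each value is set explicitly by the author, which reflects the manual declaration that distinguishes ClientViewState from the ASP.NET ViewState.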

[1828] 23 Confirmation Algorithm

[1829] Semantic Processing Algorithm:

[1830] There are three stages for semantic processing:

[1831] 1) Preprocessing, Carried Out when a QA is Active:

[1832] This stage is responsible for creating the array of answers to be considered in this iteration. This includes all the Answers and the Confirms that need confirmation. Internally, it creates a structure as follows:

Answer ID | CurrentValue
Answer ID | CurrentValue

[1833] This information is also passed to the PromptSelectFunction, GrammarSelectFunction, etc.

[1834] 2) Answer Processing

[1835] In this stage, we process the Answer objects in the Answers and ExtraAnswers collections. If any item from the Answers collection is matched, a flag indicating this fact is set.

[1836] Answer processing sets the confirmation status of the associated semantic item. This status can be either NEEDS_CONFIRMATION or CONFIRMED. If the confidence value associated with the smlNode specified by the Answer's XpathTrigger is less than or equal to the Answer's confirmationThreshold, the status of the semantic item is set to NEEDS_CONFIRMATION. Otherwise it is set to CONFIRMED.
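The threshold rule in answer processing reduces to a single comparison, sketched here. The function name is hypothetical; the status strings and the "less than or equal" boundary follow the text.

```javascript
// Returns the confirmation status for a recognized answer.
// A confidence equal to the threshold still needs confirmation.
function confirmationStatus(confidence, confirmationThreshold) {
  return confidence <= confirmationThreshold ? "NEEDS_CONFIRMATION" : "CONFIRMED";
}
```

For example, with a threshold of 0.5, a confidence of exactly 0.5 needs confirmation, while 0.51 is confirmed outright.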

[1837] 3) Confirmation Processing:

[1838] a) Examine the sml document and search for XpathAcceptConfirms and XpathDenyConfirms. Set a global confirmation state to NEUTRAL (none was present), ACCEPT (XpathAcceptConfirms was present) or DENY (XpathDenyConfirms was present). In short-timeout confirmation, silence sets the confirmation state to ACCEPT.

[1839] b) For all items to be confirmed,

[1840] If there is a value in the sml document that matches the XpathTrigger of the confirm item

[1841] If the new value is the same as the value to be confirmed, the item is confirmed

[1842] else, the item is set to the new value, and processed as an answer.

[1843] c) If no Answer object is matched from the Answers or Confirms collections,

[1844] If the confirmation state is ACCEPT

[1845] Upgrade all items that need confirmation to confirmed.

[1846] If the confirmation state is DENY

[1847] Clear (empty) all items that need confirmation.

[1848] Else,

[1849] Mark all unmatched items that needed confirmation as confirmed.
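The confirmation stage can be sketched as below, under several assumptions that are not in the text: the sml document is modeled as a plain object mapping a path to a recognized value, a corrected item is simply left needing confirmation rather than being re-scored against a threshold, ACCEPT takes precedence if both paths are present, and the final "Else" is read as applying when some Answer or Confirm was matched. The short-timeout silence case is omitted.

```javascript
var NEEDS = "NEEDS_CONFIRMATION", CONFIRMED = "CONFIRMED", EMPTY = "EMPTY";

// sml: plain object mapping a path to the recognized value (stand-in for the sml document).
// answerMatched: flag from answer processing saying whether any Answer was matched.
function processConfirmation(confirms, sml, qa, answerMatched) {
  // (a) derive the global confirmation state
  var state = "NEUTRAL";
  if (sml.hasOwnProperty(qa.xpathAcceptConfirms)) state = "ACCEPT";
  else if (sml.hasOwnProperty(qa.xpathDenyConfirms)) state = "DENY";

  // (b) look for repetitions or corrections of each item awaiting confirmation
  var matched = answerMatched, unmatched = [];
  for (var i = 0; i < confirms.length; i++) {
    var c = confirms[i];
    if (c.item.state !== NEEDS) continue;
    if (sml.hasOwnProperty(c.xpathTrigger)) {
      matched = true;
      if (sml[c.xpathTrigger] === c.item.value) c.item.state = CONFIRMED;
      else c.item.value = sml[c.xpathTrigger];   // corrected value, still needs confirmation
    } else {
      unmatched.push(c.item);
    }
  }

  // (c) apply the global state, or confirm the items the user left alone
  for (var j = 0; j < unmatched.length; j++) {
    var item = unmatched[j];
    if (matched || state === "ACCEPT") item.state = CONFIRMED;
    else if (state === "DENY") { item.value = ""; item.state = EMPTY; }
  }
  return state;
}
```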

[1850] 24 Exceptions

[1851] The following table lists the exceptions thrown by Speech Controls during render time.

Control/object | Attribute/Method/Object | Condition | Exception
QA | SpeechIndex | SpeechIndex < 0 | ArgumentOutOfRangeException
QA | XpathDenyConfirms | XpathDenyConfirms not specified if Confirm specified | ArgumentNullException
QA | Answers | Answers collection contains a non-answer object | ArgumentException
QA | Prompt | Prompt nonexistent in Voice-only mode | ArgumentNullException
QA | FirstInitialTimeout | FirstInitialTimeout specified without Confirms being specified | InvalidOperationException
QA | FirstInitialTimeout | FirstInitialTimeout < 0 | ArgumentOutOfRangeException
QA | AcceptRejectThreshold | AcceptRejectThreshold < 0 or > 1 | ArgumentOutOfRangeException
QA | DenyRejectThreshold | DenyRejectThreshold < 0 or > 1 | ArgumentOutOfRangeException
Command | SpeechIndex | SpeechIndex < 0 | ArgumentOutOfRangeException
Command | Scope | Scope not valid | ArgumentException
Command | Scope | Scope not specified | ArgumentNullException
Command | Type | Type not specified | ArgumentNullException
Command | Type/Scope | More than 1 Command of same Type has same Scope | InvalidOperationException
Command | AcceptCommandThreshold | AcceptCommandThreshold < 0 or > 1 | ArgumentOutOfRangeException
Command | XpathTrigger | XpathTrigger not specified | ArgumentNullException
Command | AutoPostBack | AutoPostBack is true and Triggered handler not specified | InvalidOperationException
Command | AutoPostBack | AutoPostBack is false and Triggered handler is specified | InvalidOperationException
CompareValidator | SpeechIndex | SpeechIndex < 0 | ArgumentOutOfRangeException
CompareValidator | SemanticItemToCompare | one of SemanticItemToCompare and ValueToCompare is not specified | InvalidOperationException
CompareValidator | ValueToCompare | one of SemanticItemToCompare and ValueToCompare is not specified | InvalidOperationException
CompareValidator | ValueToCompare | ValueToCompare cannot be converted to a valid Type | InvalidOperationException
CompareValidator | SemanticItemToValidate | SemanticItemToValidate not specified | ArgumentNullException
CustomValidator | SpeechIndex | SpeechIndex < 0 | ArgumentOutOfRangeException
CustomValidator | SemanticItemToValidate | SemanticItemToValidate not specified | ArgumentNullException
CustomValidator | ClientValidationFunction | ClientValidationFunction not specified | ArgumentNullException
Answer object | XpathTrigger | XpathTrigger not specified for Answers or ExtraAnswers | ArgumentNullException
Answer object | ConfirmThreshold | ConfirmThreshold < 0 or > 1 | ArgumentOutOfRangeException
Answer object | Reject | Reject < 0 or > 1 | ArgumentOutOfRangeException
Answer object | AutoPostBack | Answer.Triggered has a handler but Answer.AutoPostBack is false | InvalidOperationException
SemanticItem object | TargetElement | TargetElement specifies multiple ids |
SemanticItem object | TargetAttribute | TargetAttribute is not specified when TargetElement is specified | ArgumentNullException
SemanticItem object | BindAt | BindAt set to an invalid value | ArgumentException
SemanticItem object | BindAt | BindAt is “server” and SemanticItem.TargetElement is not a server-side control | ArgumentException
SemanticItem object | BindAt | BindAt is “server” and SemanticItem.TargetAttribute is not a member of the control specified by SemanticItem.TargetElement | ArgumentException
SemanticItem object | BindAt | BindAt is “server” and SemanticItem.TargetAttribute is a member of SemanticItem.TargetElement, but is not of type string | ArgumentException
SemanticItem object | BindAt | BindAt is “server” and SemanticItem.TargetAttribute is a string, but is read-only | ArgumentException
Reco object | initialTimeout | initialTimeout negative | ArgumentOutOfRangeException
Reco object | babbleTimeout | babbleTimeout negative | ArgumentOutOfRangeException
Reco object | maxTimeout | maxTimeout negative | ArgumentOutOfRangeException
Reco object | endSilence | endSilence negative | ArgumentOutOfRangeException
Reco object | reject | reject < 0 or > 1 | ArgumentOutOfRangeException
Reco object | Grammars | Grammars collection contains a non-grammar object | ArgumentException
Reco object | Params | name not specified | ArgumentNullException
Reco object | Params | contains a non-param object | ArgumentException
Grammar object | src/InlineGrammar | one of src or InlineGrammar is not specified | ArgumentNullException
Prompt object | Params | name not specified | ArgumentNullException
Prompt object | Params | contains a non-param object | ArgumentException
Dtmf object | initialTimeout | initialTimeout < 0 | ArgumentOutOfRangeException
Dtmf object | interdigitTimeout | interdigitTimeout < 0 | ArgumentOutOfRangeException
Dtmf object | endSilence | endSilence < 0 | ArgumentOutOfRangeException
Dtmf object | Params | name not specified | ArgumentNullException
Dtmf object | Params | contains a non-param object | ArgumentException
StyleSheet | | contains an object which is not a Style object | ArgumentException
Style object | StyleReference | StyleReference is invalid | ArgumentException
SmexMessageBase | Timer | Timer < 0 | ArgumentOutOfRangeException
SmexMessage | Message | Message not specified | ArgumentNullException
MakeCall | CalledDirectoryNumber | CalledDirectoryNumber not specified | ArgumentNullException

[1852] 26 Terms and Definitions

Term | Definition
Voice-only | A mode of dialog that utilizes only speech input and output. There are no visual elements presented to the end user. Voice-only dialog typically implies the end user communication via the telephone. However, voice-only interaction may occur in a desktop computer setting.
Multimodal | A mode of dialog that utilizes speech input and visual output. Multimodal typically implies end user communication with a dialog via a hand-held computing device such as a Pocket PC.
Tap-and-talk | A form of dialog interaction that utilizes speech input and visual output. This form of dialog interaction typically occurs on a hand-held computer such as a Pocket PC. The end user selects (“taps”) the visual element with a stylus or pen-like device and provides input to the visual element using speech (“talk”).
Mixed Initiative | A form of dialog interaction model, whereby the user is permitted to share the dialog initiative with the system, e.g., by providing more answers than requested by a prompt, or by switching task when not prompted to do so.
SAPI SML | SAPI Semantic markup language. The XML document returned by SAPI 6.0 when an utterance is determined to be in-grammar. (SAPI SML is a SAPI-specific return format. Speech tags interpreters are agnostic to the actual content format of the returned document, provided it is an XML document.) SAPI SML contains semantic values, confidence scores and the words used by the speaker. (It is generated by script or XSLT instructions contained within the grammar rules.) SAPI SML is described in greater detail in the Speech Core document SML Generation.
CSTA | Computer Supported Telecommunications Applications, an ECMA standard. From the ECMA document: “CSTA is an interface that provides access to telecommunication functions that may be used with your phone (or many other communication devices) and may also be used by 3rd party applications such as Contact/Call Centres (e.g. ACD systems).” http://www.ecma.ch/ecma1/TOPICS/TC32/TG11/CSTA.HTM
System Initiative | A form of dialog interaction model, whereby the system holds the initiative, and drives the dialog with typically simple questions to which only a single answer is possible.
XPath | XML Path language, a W3C recommendation for addressing parts of an XML document. See http://www.w3.org/TR/xpath.

[1853] 27 Platform Parameter Settings

[1854] The <param> mechanism (described in the sections Prompt object contents, Reco object contents and Dtmf object contents) is used to configure platform settings. The following “params” are recognized by all Microsoft platforms:

Object: Prompt
Name: server
Value: URI describing the location of the speech server
Default: http://localhost (client) and registry setting (telephony server)
Description: This configuration setting selects the speech server used for speech processing.

Object: Prompt
Name: bargeintype
Value: This attribute sets the type of recognition input event that the browser uses to determine whether an onbargein event should be fired. There are three types of bargeintype that can be set: “speech”, “grammar” and “final”.
Default: The default setting is “speech”. If the platform does not support the type selected, the browser defaults to “speech”.
Description: The barge-in types are defined as:
speech: This represents speech/sound/energy (“SOUND_START”) detected by the recognition engine.
grammar: This represents the audio partially matching the recognition grammar. The speech server will generate a “PHRASE_START” event, and possibly a semantic event (a semantic property in the phrase hypothesis has confidence greater than the confidence threshold). The client decides when to throw “onbargein” based on the capabilities sent by the speech server when a session is opened. The confidence threshold used by the semantic event is a client platform setting.
final: This represents using a “valid” final recognition result (i.e., a result where the utterance confidence level is above the “reject” threshold). Run in conjunction with multiple recognition mode, this represents the recognizer continuously listening for a valid result, for hotword/wake-up style scenarios. Note that in this case the browser must fire onbargein before firing onreco.

Object: Reco
Name: server
Value: URI describing the location of the speech server
Default: http://localhost (client) and registry setting (telephony server)
Description: This configuration setting selects the speech server used for speech processing.

[1855] 28 DET Descriptions

[1856] The following table lists brief descriptions for each control, object and attribute. These descriptions will be used by the DET tool and exposed to the dialog author using Visual Studio.

Control/object | Attribute/Method/Object | Brief description
QA | Id | Programmatic name of the control
QA | SpeechIndex | Activation order of the control
QA | ClientActivationFunction | Client-side function used to determine whether or not to activate the QA control
QA | OnClientActive | Client-side function called after QA is determined to be active
QA | OnClientComplete | Client-side function called after execution of QA (successfully or not)
QA | OnClientListening | Client-side function called after successful start of the reco object
QA | AllowCommands | Whether or not Commands may be activated for this QA
QA | PlayOnce | Whether or not this QA may be activated more than once per page
QA | XpathAcceptConfirms | The path in the sml document that indicates the confirm items were accepted
QA | XpathDenyConfirms | The path in the sml document that indicates the confirm items were denied
QA | FirstInitialTimeout | Specifies initial timeout when QA.Count == 1
QA | Answers | An array of answer objects
QA | ExtraAnswers | An array of answer objects
QA | Confirms | An array of answer objects
QA | Prompt | The Prompt object for this QA
QA | Reco | The Reco object for this QA
QA | Dtmf | The Dtmf object for this QA
Command | Id | Programmatic name of the control
Command | SpeechIndex | Activation order of the control
Command | Scope | The id of ASP.NET control that activates this Command grammar
Command | Type | The type of this Command in order to allow the overriding of identically typed commands
Command | XpathTrigger | SML document path that triggers this command
Command | AcceptCommandThreshold | Confidence level of recognition that is necessary to trigger this command
Command | OnClientCommand | Function to execute on recognition of this Command's grammar
Command | AutoPostBack | Whether or not Command control posts back to server when Command grammar is recognized
Command | Prompt | A Prompt object
Command | Grammar | The grammar object which will listen for the command
Command | Dtmf | The Dtmf object which will activate the command
CompareValidator | Id | Programmatic name of the control
CompareValidator | SpeechIndex | Activation order of the control
CompareValidator | Type | Sets the datatype of the comparison
CompareValidator | ElementToCompare | The JScript variable or Id of the SemanticItem used as the basis for the comparison
CompareValidator | SemanticItemToValidate | The Id of the control that is being validated
CompareValidator | SemanticItemToCompare | The Id of the control that is the basis for comparison
CompareValidator | Operator | Validation operator
CompareValidator | InvalidateBoth | Whether or not to invalidate both ElementToCompare and ElementToValidate
CompareValidator | Prompt | Prompt to indicate the error
CustomValidator | id | Programmatic name of the control
CustomValidator | SpeechIndex | Activation order of the control
CustomValidator | SemanticItemToValidate | The Id of the control that is being validated
CustomValidator | AttributeToValidate | Attribute of the ElementToValidate that contains the value being validated
CustomValidator | ClientValidationFunction | Validation function
CustomValidator | Prompt | Prompt to indicate the error
Answer object | id | Programmatic name of the object
Answer object | XpathTrigger | The part of the SML document this answer refers to
Answer object | ClientNormalizationFunction | Function that returns author-specified transformation of the recognized item
Answer object | SemanticItem | The semantic item to which this answer should be written
Answer object | ConfirmThreshold | The minimum confidence level of recognition necessary to mark this item as confirmed
Answer object | Reject | Rejection threshold for the Answer
Answer object | OnClientAnswer | Function to be called when the XpathTrigger is matched
Answer object | AutoPostBack | Whether or not to post back to the server each time user interacts with the control
Prompt object | id | Programmatic name of the object
Prompt object | type | Mime-type corresponding to the speech output format
Prompt object | prefetch | Whether or not the prompt should be immediately synthesized and cached at browser when the page is loaded
Prompt object | lang | The language of the prompt content
Prompt object | bargein | Whether or not the speech platform is responsible for stopping prompt playback when speech or DTMF input is detected
Prompt object | PromptSelectFunction | Function that selects and/or modifies a prompt string prior to playback
Prompt object | OnClientBookmark | Function which is called when a bookmark is reached in the prompt text during playback
Prompt object | OnClientError | Function called in response to an error event in the client
Prompt object | InLinePrompt | Text of the prompt
Prompt object | Params | Specifies non-standard speech platform configuration values
Reco object | Id | Programmatic name of the object
Reco object | StartElement | Name of the GUI element to throw the start event
Reco object | StartEvent | Name of the GUI event that will activate the underlying client-side Reco object
Reco object | StopElement | Name of the GUI element to throw the stop event
Reco object | StopEvent | Name of the GUI event that will deactivate the underlying client-side Reco object
Reco object | initialTimeout | The time in milliseconds between start of recognition and the detection of speech
Reco object | babbleTimeout | The period of time in milliseconds in which the recognizer must return a result after detection of speech
Reco object | maxTimeout | The period of time in milliseconds between recognition start and results returned to the browser
Reco object | endSilence | Period of silence in milliseconds after the end of an utterance after which the recognition results are returned
Reco object | Reject | The rejection threshold below which the platform will throw the noReco event
Reco object | Lang | The language of the speech recognition engine
Reco object | Mode | Specifies the recognition mode to be followed
Reco object | GrammarSelectFunction | Client-side function called prior to starting the recognition process
Reco object | OnClientSilence | Client-side function that will be called after detecting silence
Reco object | OnClientNoReco | Client-side function that will be called after detecting no recognition
Reco object | OnClientError | Client-side function that will be called after recognition errors
Reco object | OnClientSpeechDetected | Client-side function called when recognition platform detects speech
Reco object | Grammars | An array of grammar objects
Reco object | Params | Specifies non-standard speech platform configuration values
Reco object | Record | Used for recording audio input from the user
Grammar | id | Programmatic name of the object
Grammar | type | Mime-type of the grammar format used
Grammar | lang | Language of the grammar
Grammar | src | URI of the grammar to load
Grammar | InLineGrammar | Text of the grammar
Dtmf object | id | Programmatic name of the object
Dtmf object | numDigits | Number of key presses required to end the DTMF collection session
Dtmf object | autoflush | Whether or not to automatically flush the DTMF buffer on the underlying telephony interface card before activation
Dtmf object | terminalChar | Terminating key to end the DTMF collection session
Dtmf object | initialTimeout | Number of milliseconds to wait between activation and the first key press before raising a timeout event
Dtmf object | interdigitTimeout | Number of milliseconds to wait between key presses before raising a timeout event
Dtmf object | SMLContext | DTMF results wrapped in SML tags
Dtmf object | OnClientSilence | Function that executes if there is no DTMF key press before initialTimeout expires
Dtmf object | OnClientKeyPress | Function that executes on every pressing of a legal DTMF key
Dtmf object | OnClientError | Function that executes if serious or fatal error occurs with the DTMF collection/recognition process
Dtmf object | Params | Specifies non-standard DTMF engine configuration values
Params | name | The name of the parameter to be configured
Params | Value | The value assigned to the named parameter
record | enabled | Whether or not to record user input
record | type | MIME type of the file containing the recorded audio
record | | Whether or not to play a beep before recording begins

Appendix D Overview

[1857] 1 Design Principles

[1858] Application Controls are a means to wrap common speech scenarios in one control. Application Controls must work in both multimodal and voice-only modes, except for the Navigator control, which is a voice-only control.

[1859] Application Controls are “companions” to the visual controls. As such, they may not have all the properties that are needed to run a full application. It is likely that authors will need to get some pieces of information directly from the visual controls.

[1860] Application controls include a set of default prompts to facilitate rapid design. Not all prompts are included; in such cases authors must provide a prompt that makes sense in the context of the application. It is recommended that authors use the prompt editor to create professional, topical prompts before deploying their application.

[1861] Application controls do not currently have a styleref property. This feature will be added for M4.

[1862] 2 Design Details

[1863] All controls should derive from ApplicationControl or BasicApplicationControl. They inherit from SpeechControlBase and implement INamingContainer.

[1864] Although not required, all controls will, as much as possible, follow a common coding framework:

[1865] 1. Internal QAs are created in the CreateChildControls methods.

[1866] 2. Script is rendered by overriding ISpeechRender.RenderSpeechHtml and SpeechRender.RenderSpeechScript.

[1867] 3. Every control outputs a JScript object to the page. This object contains information related to the control. In particular, all built-in functions are part of this object in order to minimize name clashes.

[1868] 4. All built-in JavaScript functions are included in a JavaScript file and not in C#. Prompt-related functions are put in a file called ControlName-prompt.js. All other functions are put in a file called ControlName-code.js.

[1869] 5. The built-in prompt and grammar libraries are loaded from resources to allow localization. Only the names of the libraries are in the resources. The prompts and grammars themselves are in the libraries.

[1870] 3 Deployment

[1871] Application controls will be deployed to the web server in a separate DLL.

[1872] Application controls might have extra script files, also deployed to the web server.

[1873] Application controls will be added to the GAC, and will be available through the Toolbar in Visual Studio.

[1874] Namespace:

[1875] Microsoft.Web.UI.Speech.ApplicationControls

[1876] Dll:

[1877] Microsoft.Web.UI.Speech.ApplicationControls.dll

[1878] Script:

[1879] %SystemDrive%\Inetpub\wwwroot\aspnet_speech\client_script\en-US\*.js

[1880] Grammar

[1881] %SystemDrive%\Inetpub\wwwroot\aspnet_speech\client_script\en-US\

Controls Reference

[1882] The following reference documents provide more information on the implementation of Application Controls:

[1883] Speech Controls Functional Specification

[1884] ASP.NET

[1885] 1 Common Attributes

[1886] Application controls derive from one of two base classes. These classes are public and developers of application controls should inherit from them. The first base class contains a minimal set of properties that the application controls should support. The second class contains a richer set of properties. Application controls should, if possible, support this richer set. Most application controls will support extra properties that are not included in the base classes because they are specific to each control.

[1887] The two base classes are described below. Some common extra properties are also mentioned.

[1888] All application controls derive from SpeechControlBase and inherit all its members. All application controls also implement INamingContainer. The inherited members are not listed here.

[1889] 1.1 BasicApplicationControl

[1890] This class is abstract. It inherits from SpeechControlBase and INamingContainer.

[1891] public abstract class BasicApplicationControl : IndexedSpeechControl
{
    bool AllowCommands { get; set; }
    int BabbleTimeout { get; set; }
    bool Bargein { get; set; }
    string CarrierGrammarUrl { get; set; }
    string ClientActivationFunction { get; set; }
    int EndSilence { get; set; }
    int InitialTimeout { get; set; }
    int MaxTimeout { get; set; }
    string OnClientActiveFirst { get; set; }
    string OnClientCompleteLast { get; set; }
    string PostAnswerCarrierRule { get; set; }
    string PreAnswerCarrierRule { get; set; }
    string PromptSelectFunction { get; set; }
    string QuestionPrompt { get; set; }
    string PromptDatabase { get; set; }
}

[1892] 1.1.1 BasicApplicationControl Properties

[1893] AllowCommands

[1894] Optional. Only used in voice-only mode. Default: true. This property is passed in to all relevant internal QA controls created by this control.

[1895] BabbleTimeout

[1896] Optional. Used in both multimodal and voice-only modes. Default is 0.

[1897] This property is passed in to all the relevant internal QA controls created by this control. An exception will be thrown for negative values of BabbleTimeout.

[1898] Bargein

[1899] Optional. Only used in voice-only mode. Default: true. Specifies whether or not the playback of the prompt may be interrupted by the human listener. This property is passed in to all the relevant internal QA controls created by this control.

[1900] CarrierGrammarUrl

[1901] Optional. Used in both multimodal and voice-only modes. Default:“ ”

[1902] URL for the carrier grammar. This grammar contains carrier phrases such as “I would like” or “please” which may be used by the user but do not contain semantic information. An exception will be thrown if a PreAnswerCarrierRule, PostAnswerCarrierRule, PreConfirmCarrierRule, or PostConfirmCarrierRule is specified and CarrierGrammarUrl is not specified.

[1903] ClientActivationFunction

[1904] Optional. Only used in voice-only mode. Default: “”. Client-side function used to determine whether or not to activate the QAs in this application control. This property is passed in to all the relevant internal QA controls created by this control.

[1905] EndSilence

[1906] Optional. Used in both multimodal and voice-only modes. For Reco objects in “automatic” mode, the period of silence, in milliseconds, after the end of an utterance that must be free of speech before the recognition results are returned. Ignored for Recos of modes other than “automatic”. If not specified, defaults to the platform internal value. An exception will be thrown for negative values.

[1907] InitialTimeout

[1908] Optional. Used in both multimodal and voice-only modes. No default value.

[1909] This property is passed in to all the relevant internal QA controls created by this control. An exception will be thrown for negative values of InitialTimeout.

[1910] MaxTimeout

[1911] Optional. Used in both multimodal and voice-only modes. Default is 0.

[1912] This property is passed in to all the relevant internal QA controls created by this control. An exception will be thrown for negative values of MaxTimeout.

[1913] OnClientActiveFirst

[1914] Optional. Used only in voice-only mode. Default: “”. Name of a function called when the first QA control of the application control gets activated. OnClientActiveFirst returns no values. The signature for OnClientActiveFirst is:

[1915] function onClientActiveFirst(string eventsource, string lastCommandOrException, int Count, object SemanticItemList)

[1916] where:

[1917] eventsource is the id of the object (specified by Reco.StartEvent) whose event started the Reco associated with the QA (for multimodal). eventsource will be null in voice-only mode.

[1918] lastCommandOrException is a Command type (e.g., “Help”) or a Reco event (e.g., “Silence” or “NoReco”) for voice-only mode. lastCommandOrException is the empty string for multimodal. See Speech Controls Functional Specification for more information on the lastCommandOrException parameter.

[1919] Count is the number of times the first activated QA has been activated. Count is always 1.

[1920] SemanticItemList is an associative array that maps semantic item id to semantic item objects.
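A handler matching the OnClientActiveFirst signature might look like the following sketch. The logging array and the recorded fields are hypothetical and exist only so the effect of the call can be observed; an author would assign the function name to the control's OnClientActiveFirst property.

```javascript
// Hypothetical log so the handler's effect can be inspected.
var activationLog = [];

function onClientActiveFirst(eventsource, lastCommandOrException, Count, SemanticItemList) {
  // Collect the ids of the semantic items known to the control.
  var ids = [];
  for (var id in SemanticItemList) {
    if (Object.prototype.hasOwnProperty.call(SemanticItemList, id)) ids.push(id);
  }
  activationLog.push({
    mode: eventsource === null ? "voice-only" : "multimodal",  // eventsource is null in voice-only mode
    last: lastCommandOrException,
    count: Count,
    semanticItems: ids
  });
  // No return value is expected.
}
```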

[1921] OnClientCompleteLast

[1922] Optional. Used in both multimodal and voice-only modes. Default: “”. Name of a function called when the last QA control of the application control is completed. OnClientCompleteLast returns no values. The signature for OnClientCompleteLast is:

[1923] function onClientCompleteLast(string eventsource, string lastCommandOrException, int Count, object SemanticItemList)

[1924] where:

[1925] eventsource is the id of the object (specified by Reco.StartEvent) whose event started the Reco associated with the QA (for multimodal). eventsource will be null in voice-only mode.

[1926] lastCommandOrException is a Command type (e.g., “Help”) or a Reco event (e.g., “Silence” or “NoReco”) for voice-only mode. lastCommandOrException is the empty string for multimodal. See Speech Controls Functional Specification for more information on the lastCommandOrException parameter.

[1927] Count is the number of times the last QA has been activated consecutively. Count is always 1 in voice-only and zero in multimodal.

[1928] SemanticItemList is an associative array that maps semantic item id to semantic item objects.

[1929] PostAnswerCarrierRule

[1930] Optional. Used in both multimodal and voice-only modes. Default:“ ”

[1931] Name of the rule in the carrier grammar that contains carrier phrases used after providing an answer (e.g., “please”). An exception will be thrown if a PostAnswerCarrierRule is specified and CarrierGrammarUrl is not specified.

[1932] PreAnswerCarrierRule

[1933] Optional. Used in both multimodal and voice-only modes. Default: “ ”

[1934] Name of the rule in the carrier grammar that contains carrier phrases used before providing an answer (e.g., “I would like”). An exception will be thrown if a PreAnswerCarrierRule is specified and CarrierGrammarUrl is not specified.

[1935] PromptSelectFunction

[1936] Optional. Only used in voice-only mode. Specifies a client-side function that allows authors to select and/or modify a prompt string prior to playback. The function returns the prompt string. PromptSelectFunction is called once the QA has been activated and before the prompt playback begins.

[1937] The signature for PromptSelectFunction is as follows:

[1938] String PromptSelectFunction(string lastCommandOrException, int Count, object SemanticItemList, string QA, object AppControlData)

[1939] where:

[1940] lastCommandOrException is a Command type (e.g., “Help”) or a Reco event (e.g., “Silence” or “NoReco”). See Speech Controls Functional Specification for more information on the lastCommandOrException parameter.

[1941] Count is the number of times the QA has been activated consecutively. Count starts at 1 and has no limit. See Speech Controls Functional Specification for more information on the Count parameter.

[1942] SemanticItemList is an associative array that maps semantic item id to semantic item objects.

[1943] QA is a coded name for the current active QA (e.g., “question”, “confirm”).

[1944] AppControlData contains information pertaining to the application control.

[1945] Controls contain built-in prompts for question, confirm, silence, noreco and help. The default behavior is to play the silence, noreco or help prompt, if appropriate, followed by the question or confirm prompt. If the PromptSelectFunction returns null, the default prompt will be played.
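As a minimal sketch of such a function, the example below overrides the prompt only after repeated silence on the question QA and otherwise returns null so that the built-in prompt is played. The function name `selectPrompt`, the threshold of two silences, and the prompt wording are assumptions made for illustration.

```javascript
// Hypothetical PromptSelectFunction. Returning null asks the control to play
// its default prompt, as described above.
function selectPrompt(lastCommandOrException, Count, SemanticItemList, QA, AppControlData) {
  if (QA === "question" && lastCommandOrException === "Silence" && Count > 2) {
    // After the QA has been activated more than twice in a row with silence,
    // give a more explicit instruction (wording is illustrative only).
    return "I still can't hear you. Please say a number, for example two. " +
           "How many pizzas do you want?";
  }
  if (QA === "confirm" && lastCommandOrException === "NoReco") {
    return "Sorry. Please just say yes or no.";
  }
  // Fall back to the built-in prompts in every other case.
  return null;
}
```

Because Count starts at 1 and grows with each consecutive activation, a function of this kind can escalate its wording without keeping any state of its own.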

[1946] QuestionPrompt

[1947] Only used and required in voice-only mode. No default. Text of the initial question to be played (e.g., “How many pizzas do you want?”).

[1948] PromptDatabase

[1949] Optional. Only used in voice-only mode. Default: “”. Name of the prompt database.

[1950] 1.2 ApplicationControl

[1951] This class is abstract. It inherits from BasicApplicationControl.

[1952] public abstract class ApplicationControl : BasicApplicationControl
{
    bool AutoPostback{get; set;};
    float ConfirmThreshold{get; set;};
    float ConfirmRejectThreshold{get; set;};
    EventHandler CompleteLast;
    int FirstInitialTimeout{get; set;};
    string Mode{get; set;};
    string OnClientActive{get; set;};
    string OnClientComplete{get; set;};
    string OnClientListening{get; set;};
    string PostConfirmCarrierRule{get; set;};
    string PreConfirmCarrierRule{get; set;};
    float RejectThreshold{get; set;};
    string StartElement{get; set;};
    string StartEvent{get; set;};
    string StopElement{get; set;};
    string StopEvent{get; set;};
}

[1953] 1.2.1 ApplicationControl Properties

[1954] AutoPostback

[1955] Optional. Used in both multimodal and voice-only modes. Default is false. If true, the control fires the CompleteLast event immediately after OnClientCompleteLast has executed. If AutoPostback is false, the control fires the CompleteLast event when the next post back occurs. An exception will be thrown if AutoPostback is true and CompleteLast is not specified.

[1956] ConfirmThreshold

[1957] Optional. Used only in voice-only mode. The minimum confidence level of recognition necessary to mark an item as confirmed. Legal values are 0-1. Default: 1, i.e., by default confirmation is always performed. This property is passed in to all the internal QA controls created by this control. An exception will be thrown for out-of-range values.

[1958] ConfirmRejectThreshold

[1959] Optional. Used only in voice-only mode. Legal values are 0-1. ConfirmRejectThreshold is the confidence level that an accept or deny answer must exceed in order to be accepted. This threshold is usually higher than the RejectThreshold, which applies to all other answers. This property is passed in to all the relevant internal confirm answer elements created by this control. An exception will be thrown for out-of-range values.

[1960] CompleteLast

[1961] Optional. Used in both multimodal and voice-only modes. Default: null. Specifies a server-side function to be executed when the CompleteLast event is fired. The CompleteLast event is fired after the OnClientCompleteLast function has executed if AutoPostback is true. If AutoPostback is false, the CompleteLast event is fired at the next post back.

[1962] Mode

[1963] Optional. Used in both multimodal and voice-only modes. Default is “automatic”. Specifies the recognition mode to be followed. Legal values are “automatic”, “single”, and “multiple”. See the mode property of the Reco object in the Speech Control spec for more information.

[1964] OnClientActive

[1965] Optional. Used in both multimodal and voice-only modes. Default: “”. This property is passed in to all the relevant internal QA controls created by this control. The OnClientActive function does not return values. The signature for OnClientActive is as follows:

[1966] function OnClientActive(string eventsource, string lastCommandOrException, int Count, object SemanticItemList)

where:

[1967] eventsource is the id of the object (specified by Reco.StartEvent) whose event started the Reco associated with the QA (for multimodal). eventsource will be null in voice-only mode.

[1968] lastCommandOrException is a Command type (e.g., “Help”) or a Reco event (e.g., “Silence” or “NoReco”) for voice-only mode. lastCommandOrException is the empty string for multimodal. See Speech Controls Functional Specification for more information on the lastCommandOrException parameter.

[1969] Count is the number of times the current QA has been activated consecutively. Count starts at 1 and has no limit for voice-only mode. Count is zero for multimodal. See Speech Controls Functional Specification for more information on the Count parameter.

[1970] SemanticItemList: For voice-only mode, SemanticItemList is an associative array that maps semantic item id to semantic item objects. For multimodal, SemanticItemList will be null.

[1971] OnClientComplete

[1972] Optional. Used in both multimodal and voice-only modes. Default: “”.

[1973] This property is passed in to all the internal QA controls created by this control.

[1974] The onClientComplete function does not return values. The signature for onClientComplete is as follows:

[1975] function onClientComplete(string eventsource, string lastCommandOrException, int Count, object SemanticItemList)

[1976] where:

[1977] eventsource is the id of the object (specified by Reco.StopEvent) whose event stopped the Reco associated with the QA (for multimodal). eventsource will be null in voice-only mode.

[1978] lastCommandOrException is a Command type (e.g., “Help”) or a Reco event (e.g., “Silence” or “NoReco”) for voice-only mode. lastCommandOrException is the empty string for multimodal. See Speech Controls Functional Specification for more information on the lastCommandOrException parameter.

[1979] Count is the number of times the current QA has been activated consecutively. Count starts at 1 and has no limit for voice-only mode. Count is zero for multimodal. See Speech Controls Functional Specification for more information on the Count parameter.

[1980] SemanticItemList: For voice-only mode, SemanticItemList is an associative array that maps semantic item id to semantic item objects. For multimodal, SemanticItemList will be null.

[1981] OnClientListening

[1982] Optional. Used in both multimodal and voice-only modes. Default: “”

[1983] This property is passed in to all the internal QA controls created by this control. The function does not return any values. The signature for OnClientListening is as follows:

[1984] function OnClientListening(string eventsource, string lastCommandOrException, int Count, object SemanticItemList)

[1985] where:

[1986] eventsource is the id of the object (specified by Reco.StartEvent) whose event started the Reco associated with the QA (for multimodal). eventsource will be null in voice-only mode.

[1987] lastCommandOrException is a Command type (e.g., “Help”) or a Reco event (e.g., “Silence” or “NoReco”) for voice-only mode. lastCommandOrException is the empty string for multimodal. See Speech Controls Functional Specification for more information on the lastCommandOrException parameter.

[1988] Count is the number of times the current QA has been activated consecutively. Count starts at 1 and has no limit for voice-only mode. Count is zero for multimodal. See Speech Controls Functional Specification for more information on the Count parameter.

[1989] SemanticItemList: For voice-only mode, SemanticItemList is an associative array that maps semantic item id to semantic item objects. For multimodal, SemanticItemList will be null.

[1990] Note: OnClientListening is not called in the last QA of each Application Control.

[1991] PostConfirmCarrierRule

[1992] Optional. Only used in voice-only mode. Default: “”. Name of the rule in the carrier grammar that contains carrier phrases used after providing a correction. An exception will be thrown if a PostConfirmCarrierRule is specified and CarrierGrammarUrl (inherited from the BasicApplicationControl class) is not specified.

[1993] PreConfirmCarrierRule

[1994] Optional. Only used in voice-only mode. Default: “”. Name of the rule in the carrier grammar that contains carrier phrases used before providing a correction. An exception will be thrown if a PreConfirmCarrierRule is specified and CarrierGrammarUrl (inherited from the BasicApplicationControl class) is not specified.

[1995] RejectThreshold

[1996] Optional. Used in both multimodal and voice-only modes. Legal values are 0-1. Default: 0. An exception will be thrown for out-of-range values.

[1997] This property is passed in to all the internal QA controls created by this control.
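One way to picture how RejectThreshold and ConfirmThreshold could partition recognition confidence is sketched below. The function name `classifyAnswer`, the three result labels, and the handling of values exactly equal to a threshold are assumptions; the text above states only that ConfirmThreshold is the minimum confidence needed to mark an item as confirmed and that both thresholds lie between 0 and 1.

```javascript
// Illustrative classification of a recognition result by confidence.
// Boundary handling (strict versus non-strict comparison) is an assumption.
function classifyAnswer(confidence, rejectThreshold, confirmThreshold) {
  if (confidence < rejectThreshold) {
    // Too unreliable to use at all.
    return "rejected";
  }
  if (confidence < confirmThreshold) {
    // Usable, but below the minimum needed to mark the item as confirmed.
    return "needsConfirmation";
  }
  return "confirmed";
}
```

With the default values (RejectThreshold 0 and ConfirmThreshold 1) nearly every answer falls into the middle band, which matches the statement that confirmation is performed by default.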

[1998] StartElement

[1999] Optional. Used only in multimodal mode. Default is “”. Specifies the id of the visual control that fires the StartEvent.

[2000] StartEvent

[2001] Optional. Used only in multimodal mode. Default: “”.

[2002] Name of the event that starts recognition in multimodal mode, e.g., “onmousedown”. An exception will be thrown if StartEvent is specified and StartElement is not.

[2003] StopElement

[2004] Optional. Used only in multimodal mode. Default is “”. Specifies the id of the visual control that fires the StopEvent.

[2005] StopEvent

[2006] Optional. Used only in multimodal mode. Default: “”. Name of the event that stops recognition in multimodal mode, e.g., “onmouseup”. An exception will be thrown if StopEvent is specified and StopElement is not.

[2007] FirstInitialTimeout

[2008] Optional. Only used in voice-only mode. Default: 800. This property is passed in to all the relevant internal QA controls created by this control. If set to 0, QA controls that use short time-out confirmation will revert to using explicit confirmation. An exception will be thrown for negative values of FirstInitialTimeout.
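Several of the ApplicationControl properties above carry validation rules that result in an exception. The sketch below gathers a subset of those rules into one hypothetical helper, `findConfigurationErrors`, that reports problems as a list instead of throwing. The helper, its plain-object `settings` argument, and the message texts are assumptions for illustration; the controls themselves perform this validation on the server.

```javascript
// Returns a list of rule violations; an empty list means none of the rules
// checked here is violated.
function findConfigurationErrors(settings) {
  var errors = [];
  function isSet(value) {
    return value !== undefined && value !== null && value !== "";
  }

  if (settings.AutoPostback === true && !isSet(settings.CompleteLast)) {
    errors.push("AutoPostback is true but CompleteLast is not specified");
  }
  // The three thresholds must lie in the range 0-1.
  ["ConfirmThreshold", "ConfirmRejectThreshold", "RejectThreshold"].forEach(function (name) {
    var value = settings[name];
    if (isSet(value) && (value < 0 || value > 1)) {
      errors.push(name + " is out of range");
    }
  });
  if (isSet(settings.StartEvent) && !isSet(settings.StartElement)) {
    errors.push("StartEvent is specified but StartElement is not");
  }
  if (isSet(settings.StopEvent) && !isSet(settings.StopElement)) {
    errors.push("StopEvent is specified but StopElement is not");
  }
  if (isSet(settings.FirstInitialTimeout) && settings.FirstInitialTimeout < 0) {
    errors.push("FirstInitialTimeout is negative");
  }
  if (isSet(settings.PostConfirmCarrierRule) && !isSet(settings.CarrierGrammarUrl)) {
    errors.push("PostConfirmCarrierRule is specified but CarrierGrammarUrl is not");
  }
  return errors;
}
```

Collecting every violation in one pass is a design choice of the sketch; it lets an author see all problems at once, whereas an exception stops at the first.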

[2009] 1.3 Other Properties

[2010] Application Controls dealing with numbers should also support DTMF. Application Controls that support DTMF must inherit from the IDTMF interface. The IDTMF interface contains the following property:

[2011] bool AllowDTMF {get; set;}

[2012] Optional. Only used in voice-only mode. Default: true. If set to true, the controls allow DTMF input. If set to false, DTMF inputs are not allowed.

[2013] 1.4 Operation

[2014] 1.4.1 Execution Flow

[2015] Each control needs to confirm values as appropriate.

[2016] Confirmation of digit inputs: When getting a series of digits that can be split into specific places (e.g., groups of 4 digits for a credit card number, groups of 3, 2 and 4 for a social security number, groups of 5 and 4 for a ZIP code), the control will allow users to stop at those places. If users stop, then the control will immediately try to confirm the digits given so far. Confirmation will be done by a short timeout confirmation of each group. Users can accept (by either saying yes or staying silent), deny or correct the value. They cannot provide more digits at this point. If a denial is made, the control tries to get and confirm the new value immediately. If a correction is made, the control tries to confirm the new value immediately. Once all digits are confirmed, the control will ask for more if users did not provide them already. If the digits given by the user do not need confirmation because they have been recognized with high enough confidence, the control will prompt users to go on (“Go on”). If DTMF is allowed, users can accept the digits by pressing the pound (#) sign. They can also correct by entering the series of digits again. Users cannot deny using DTMF.
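The grouping of digits described above can be illustrated with a short helper that splits the digits received so far at the permitted stopping places, so that each group can be confirmed in turn. The function name `splitIntoGroups` and the representation of the stopping places as an array of group sizes are assumptions for this sketch.

```javascript
// Splits a digit string into the groups at which a user may stop.
// groupSizes lists the length of each group, e.g. [3, 2, 4] for a social
// security number or [4, 4, 4, 4] for a credit card number.
function splitIntoGroups(digits, groupSizes) {
  var groups = [];
  var pos = 0;
  for (var i = 0; i < groupSizes.length && pos < digits.length; i++) {
    groups.push(digits.slice(pos, pos + groupSizes[i]));
    pos += groupSizes[i];
  }
  return groups;
}
```

For example, `splitIntoGroups("123456789", [3, 2, 4])` yields the groups `"123"`, `"45"` and `"6789"`, and a user who stops after eight digits of a credit card number yields only the first two groups of four.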

[2017] There is no way to cancel or exit out of an Application Control (except the Navigator control) without the author providing a Command control that implements the functionality.

[2018] 1.4.2 Prompting

[2019] Prompts in all Application Controls behave the same way. The question and confirm prompts are control-specific based on properties set in the control.

[2020] The Help prompt for each control consists of a control-specific help message followed by either the value of the QuestionPrompt property or a replay of the confirmation prompt, depending on the progress of the dialog flow.

[2021] When the Application Control is not able to recognize user input, the control will issue a noreco prompt followed by either the value of the QuestionPrompt property or a replay of the confirmation prompt, depending on the progress of the dialog flow.

[2022] When the control detects silence, the control will issue a silence prompt followed by either the value of the QuestionPrompt property or a replay of the confirmation prompt, depending on the progress of the dialog flow.
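The prompt composition just described (a help, noreco or silence message followed by the question or confirmation prompt) could be sketched as follows. The function name `composePrompt`, the `stage` argument, and the shape of the `prompts` object are hypothetical; only the ordering of the pieces is taken from the text above.

```javascript
// Builds the prompt to play for the current turn.
// stage is "question" or "confirm", reflecting the progress of the dialog.
function composePrompt(lastCommandOrException, stage, prompts) {
  // Map the last command or exception to its prefix message, if any.
  var prefixes = {
    "Help": prompts.help,
    "NoReco": prompts.noreco,
    "Silence": prompts.silence
  };
  var prefix = Object.prototype.hasOwnProperty.call(prefixes, lastCommandOrException)
    ? prefixes[lastCommandOrException]
    : "";
  var main = (stage === "confirm") ? prompts.confirm : prompts.question;
  return prefix ? prefix + " " + main : main;
}
```

A call such as `composePrompt("Silence", "question", {...})` therefore replays the question after the silence message, while an empty lastCommandOrException plays the question alone.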

[2023] 1.4.3 Default Grammars

[2024] The grammars built into the controls are based on the common grammar library.

[2025] 2 IDTMF Interface

[2026] Controls that support DTMF must inherit from this interface.
interface IDTMF
{
    bool AllowDTMF{get; set;};
    int InterDigitTimeout{get; set;};
    string OnClientKeyPress{get; set;};
    bool PreFlush{get; set;};
}

[2027] 2.1 IDTMF Properties

[2028] AllowDTMF

[2029] Required. Determines whether to support DTMF input.

[2030] InterDigitTimeout

[2031] Required. Determines the timeout between keypresses.

[2032] PreFlush

[2033] Required. Determines whether to automatically flush the DTMF buffer on the underlying telephony interface card before activation.

[2034] OnClientKeyPress

[2035] The name of the client-side event that will be fired each time a key is pressed.

[2036] Two more properties are included:

[2037] int InitialTimeout {get; set;}

[2038] int EndSilence {get; set;}

[2039] which are provided in BasicApplicationControl Properties.

[2040] 3 SingleItemChooser Control

[2041] The SingleItemChooser control allows users to select one item from a list of items. The grammar for selecting the item is created on the fly based on the data from the list.
class SingleItemChooser : ApplicationControl
{
    object DataSource{get; set;};
    string DataMember{get; set;};
    string DataTextField{get; set;};
    string DataBindField{get; set;};
    ITemplate GrammarTemplate{get; set;};
    string PromptSelectFunction{get; set;};
    string SemanticItem{get; set;};
}

[2042] 3.1 SingleItemChooser Properties

[2043] Common properties are described above.

[2044] DataSource

[2045] Required. Used in both multimodal and voice-only modes. Use the DataSource property to specify the source of values to bind to the SingleItemChooser control. An exception will be thrown if DataSource is not specified. The DataSource property is the same as used in other ASP.NET controls. See ASP.NET documentation for more information on the DataSource property.

[2046] DataMember

[2047] Optional. Used in both multimodal and voice-only modes. Default is null.

[2048] A data member from a multimember data source. Use the DataMember property to specify a member from a multimember data source to bind to the list control. For example, if you have a data source with more than one table specified in the DataSource property, use the DataMember property to specify which table to bind to a data listing control.

[2049] Note on databinding: The resolved data source (datasource and datamember) must be of one of the following types:

[2050] Array

[2051] Implementer of IList, provided the implementer has a strongly typed Item property (that is, the Type is anything but Object). You can accomplish this by making the default implementation of Item private. If you want to create an IList that follows the rules of a strongly typed collection, you should derive from CollectionBase.

[2052] Implementer of ITypedList.

[2053] The DataMember property is the same as used in other ASP.NET controls. See ASP.NET documentation for more information on the DataMember property.

[2054] DataTextField

[2055] Required. Used in both multimodal and voice-only modes. Default is null.

[2056] A System.String that specifies the field of the data source that provides the grammar for each individual item on the list. The string is a comma-separated list of synonyms. Each synonym is a possible way of selecting a value. An exception is thrown if this property is specified but the data source does not contain a corresponding column. An exception is thrown if a synonym can be used to select more than one value.

[2057] DataBindField

[2058] Required. Used in both multimodal and voice-only modes. Default is null.

[2059] A string that specifies the field of the data source that provides the binding values of the list items. If this property is specified but the data source does not contain a corresponding column, an exception is thrown.

[2060] GrammarTemplate

[2061] Optional. Used in both multimodal and voice-only modes. Default is null.

[2062] If specified, the template is used to fill in the grammar that will be used for recognition. Each call to the template must return a comma-delimited string of terms. Each of the terms is a possible way of saying the value. Calls are made with the data obtained from the source.

[2063] PromptSelectFunction

[2064] Optional. Only used in voice-only mode. The QA parameter passed to this function may be either “question”, “confirm”, or “acknowledge”. See Section 1.1.1 BasicApplicationControl Properties for a description of the PromptSelectFunction and its parameters.

[2065] SemanticItem

[2066] Required. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the value of the chosen item. The index of the selected item in the list will be added to the expando properties of the semantic item as “index”. The spokenText expando property of the SemanticItem will be set to the spoken text used by the user to select the item. An exception will be thrown if SemanticItem is not specified or if it is not a valid semantic item, e.g., the id does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2067] 3.2 Client-Side Object

[2068] Array AvailableOptions {get;}

[2069] Array of all the choices that can be spoken by the user (not including synonyms).

[2070] 3.3 Mark-Up
<speech:SingleItemChooser
    id=“...”
    SpeechIndex=“...”
    AllowCommands=“...”
    BabbleTimeout=“...”
    BargeIn=“...”
    CarrierGrammarUrl=“...”
    ClientActivationFunction=“...”
    EndSilence=“...”
    InitialTimeout=“...”
    MaxTimeout=“...”
    OnClientActiveFirst=“...”
    OnClientCompleteLast=“...”
    PostAnswerCarrierRule=“...”
    PreAnswerCarrierRule=“...”
    PromptSelectFunction=“...”
    QuestionPrompt=“...”
    PromptDatabase=“...”
    AutoPostback=“...”
    ConfirmThreshold=“...”
    ConfirmRejectThreshold=“...”
    CompleteLast=“...”
    Mode=“...”
    OnClientActive=“...”
    OnClientComplete=“...”
    OnClientListening=“...”
    PostConfirmCarrierRule=“...”
    PreConfirmCarrierRule=“...”
    RejectThreshold=“...”
    StartElement=“...”
    StartEvent=“...”
    StopElement=“...”
    StopEvent=“...”
    SemanticItem=“...”
    DataSource=“...”
    DataMember=“...”
    DataTextField=“...”
    DataBindField=“...”
    runat=“server”>
    <GrammarTemplate> </GrammarTemplate>
</speech:SingleItemChooser>

[2071] 3.4 Operation

[2072] 3.4.1 Execution Flow

[2073] In voice-only mode, the control execution follows this flow:

SpeechIndex    QA
1              Confirm
2              Question
3              Done

[2074] In multimodal mode, the start event starts recognition for a single item and binds the value as in voice-only mode.

[2075] If the DataSource contains no items from which to choose, the control does not render.

[2076] 3.4.2 Default Prompts

[2077] The default prompts are:

[2078] Question QA

[2079] Question: Must be specified by user or an error will be returned.

[2080] Help: “Please tell me one of the following choices” + (list of items) + Question

[2081] Confirm QA

[2082] Question: “Did you say” + SemanticItem.spokenText + “?”

[2083] Help: “Please say yes or no, or tell me the correct choice” + SemanticItem.spokenText + “?”

[2084] Also, if short timeout confirmation is allowed, i.e., FirstInitialTimeout > 0, the prompt is:

[2085] SemanticItem.spokenText + “?”

[2086] Done QA

[2087] Prompt:“”

[2088] All QA controls

[2089] Silence: “I didn't hear you.”

[2090] NoReco: “I didn't understand you.”

[2091] 3.4.3 Default Grammar

[2092] The default grammar will list in parallel all the objects in the data source. The control will put the binding value corresponding to the recognized value into the target element attribute.

[2093] The grammar can be expanded by providing a comma-separated list of synonyms rather than a single element. Users can then select the list items by using any of the synonym names. If the synonym list contains duplicates, an exception is thrown.
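The synonym handling described above can be sketched as a lookup table that maps each synonym to its binding value and rejects duplicates. The function name `buildSynonymMap`, the row objects, and the case-insensitive comparison are assumptions for illustration; the actual control builds a recognition grammar on the server and its matching rules are not specified here.

```javascript
// Builds a synonym -> binding value table from data rows.
// textField plays the role of DataTextField (comma-separated synonyms) and
// bindField the role of DataBindField.
function buildSynonymMap(rows, textField, bindField) {
  var map = Object.create(null);
  rows.forEach(function (row) {
    String(row[textField]).split(",").forEach(function (raw) {
      // Trimming and lower-casing are assumptions of this sketch.
      var synonym = raw.trim().toLowerCase();
      if (synonym === "") {
        return;
      }
      if (synonym in map) {
        // A synonym may not select more than one value.
        throw new Error("Duplicate synonym: " + synonym);
      }
      map[synonym] = row[bindField];
    });
  });
  return map;
}
```

Given rows such as `{ Text: "pepperoni, salami", Value: "P" }` and `{ Text: "cheese", Value: "C" }`, both "pepperoni" and "salami" resolve to "P"; adding "cheese" to a second row as well would raise the error.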

[2094] Authors can override the default grammar by providing a grammar template. This template is called with the data contained in the data source. This data can be used to create a specific grammar. Here is an example to allow users to refer to a person in different ways, e.g., “Nancy”, “Davolio”, “Nancy Davolio”, assuming the data source contains a FirstName and LastName column:

[2095] <grammarTemplate>

[2096] <%# DataBinder.Eval(Container.DataItem, “LastName”) %>, <%# DataBinder.Eval(Container.DataItem, “FirstName”) %>, <%# DataBinder.Eval(Container.DataItem, “FirstName”) %> <%# DataBinder.Eval(Container.DataItem, “LastName”) %> </grammarTemplate>

[2097] Here is an example to fetch the grammar from a resource, assuming that a resource manager has been initialized and the data source contains a LastName column:
<grammarTemplate> <%# ResourceManager.GetString(DataBinder.Eval(Container.DataItem, “LastName”)) %> </grammarTemplate>

[2098] 3.4.4 Default Commands

[2099] Default Help

[2100] The default help will present the choices available to the users. In order to activate help, the author needs to create a command of type ‘Help’ whose scope contains the application control. If the author provides a prompt in the Command control, the prompt in the Command control will be played before the default prompt.

3.4.5 EXAMPLE

[2101] control: Choose a topping

[2102] User: Pepperoni

[2103] control: Choose a topping

[2104] User: Help

[2105] control: You can choose from Pepperoni, Cheese and Anchovies.

[2106] Choose a topping.

[2107] User: Pepperoni

[2108] 3.5 Future Features

[2109] The following features will be considered for V2 of the Microsoft .NET Speech SDK.

[2110] 3.5.1 Spelling

[2111] When choosing an item by speaking does not work well, e.g., when choosing names, we could fall back to a spelling mode.

[2112] 3.5.2 Repeated Entries

[2113] We do not currently allow repeated entries in the datasource. We may want to investigate how these could be accepted and disambiguated.

[2114] 4 DataTableNavigator Control

[2115] This is a voice-only control. The DataTableNavigator control allows users to navigate through a table of caption/content elements.
class DataTableNavigator : BasicApplicationControl
{
    long ShortInitialTimeout{get; set;};
    object DataSource{get; set;};
    string DataMember{get; set;};
    StringArrayList DataHeaderFields{get; set;};
    StringArrayList DataContentFields{get; set;};
    bool DisableColumnNavigation{get; set;};
    ITemplate HeaderTemplate{get; set;};
    ITemplate ContentTemplate{get; set;};
    TemplateCollection Columns{get; set;};
    ITemplate GrammarTemplate{get; set;};
    string PromptSelectFunction{get; set;};
    AccessMode AccessMode{get; set;};
    SemanticItem SemanticItem{get; set;};
    GrammarCollection Grammars{get; set;};
}
enum AccessMode { Fetch, Select, Ignore };

[2116] 4.1 DataTableNavigator Properties

[2117] Common properties are described above.

[2118] ShortInitialTimeout

[2119] Optional. Default: 1200

[2120] Time in milliseconds before OnSilence is fired. If greater than 0, automatic navigation is on and OnSilence navigates to the next row of available data. If set to 0, automatic navigation is turned off. An exception will be thrown if ShortInitialTimeout is a negative value.

[2121] AccessMode

[2122] Optional. Default: AccessMode.Fetch

[2123] Allows the author to configure whether the DataTableNavigator browses to, fetches and exits on, or ignores an item in the data set that is spoken by the user. This behavior is determined by the following options:

[2124] AccessMode.Ignore: The stated name is ignored, and the noreco prompt is played.

[2125] AccessMode.Select: If this flag is set, then the Navigator builds a grammar out of the elements in the header. It does this using exactly the same mechanism as the ListSelector, i.e., allowing the author to use a grammar template to indicate synonyms and also throwing an exception when duplicate entries are detected.

[2126] When the user speaks a name in the first column, the effect is to go to the first-column entry for that name and behave as though we had navigated there by any other means, i.e., read the entry out. Following this, the Navigator will ask the ‘next command?’ question, regardless of whether it has been configured to treat Silence as Next. The theory here is that the user definitely wants to do something with the item that they have requested by name.

[2127] AccessMode.Fetch: If this flag is set, then the Navigator builds a grammar out of the elements in the header. It does this using exactly the same mechanism as the ListSelector, i.e., allowing the author to use a grammar template to indicate synonyms and also throwing an exception when duplicate entries are detected.

[2128] When the user speaks a name in the first column, the effect is to exit the Navigator, setting the semantic item with the row index of the recognized first-column name.
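The three access modes can be contrasted with a small client-side sketch. The function name `handleSpokenName`, the returned action objects, and the exact-match lookup are hypothetical; the sketch only mirrors the outcomes described above (play the noreco prompt, read the entry and ask for the next command, or exit with the row index).

```javascript
// Mirrors the AccessMode enumeration of the DataTableNavigator.
var AccessMode = { Fetch: "Fetch", Select: "Select", Ignore: "Ignore" };

// headers holds the first-column entries, one per row.
function handleSpokenName(mode, headers, spokenName) {
  if (mode === AccessMode.Ignore) {
    // The stated name is ignored and the noreco prompt is played.
    return { action: "noreco" };
  }
  var rowIndex = headers.indexOf(spokenName);
  if (rowIndex === -1) {
    // Name not in the grammar; treating this as noreco is an assumption.
    return { action: "noreco" };
  }
  if (mode === AccessMode.Select) {
    // Go to the entry, read it out, then ask the 'next command?' question.
    return { action: "readEntry", rowIndex: rowIndex, askNextCommand: true };
  }
  // AccessMode.Fetch: exit the Navigator with the row index in the semantic item.
  return { action: "exit", semanticItemValue: rowIndex };
}
```

With headers `["Seattle", "Spokane", "Yakima"]`, speaking "Spokane" in Select mode produces a read of row 1, whereas in Fetch mode the Navigator exits and the semantic item receives 1.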

[2129] SemanticItem

[2130] Required. Contains the row index of value spoken by the user.

[2131] Grammars

[2132] Optional. Default is the built-in grammar described in section 4.4.3.

[2133] Allows the user to configure the grammar supported by the built-in commands. If a grammar tag is absent, the command will not be supported by the control. If a grammar tag is present but missing a “src” attribute, the default grammar will be used.

[2134] DataSource

[2135] Required. Use the DataSource property to specify the source of values used by the Navigator control. An exception will be thrown if DataSource is not specified. The DataSource property is the same as used in other ASP.NET controls. See ASP.NET documentation for more information on the DataSource property.

[2136] DataMember

[2137] Optional. Default is null.

[2138] A data member from a multimember data source. Use the DataMember property to specify a member from a multimember data source to bind to the DataTableNavigator control. For example, if you have a data source with more than one table specified in the DataSource property, use the DataMember property to specify which table to bind to a data listing control.

[2139] Note on databinding: The resolved data source (datasource and datamember) must be of one of the following types:

[2140] Array

[2141] Implementer of IList, provided the implementer has a strongly typed Item property (that is, the Type is anything but Object). You can accomplish this by making the default implementation of Item private. If you want to create an IList that follows the rules of a strongly typed collection, you should derive from CollectionBase.

[2142] Implementer of ITypedList.

[2143] The DataMember property is the same as used in other ASP.NET controls. See ASP.NET documentation for more information on the DataMember property.

[2144] DataHeaderFields

[2145] Required. The control will concatenate the content of all the header fields to create the header prompts.

[2146] DataContentFields

[2147] Required. The control will concatenate the content of all the content fields to create the content prompts. For example, assume a DataSource that contains weather information as in the following table:

DataHeaderFields          DataContentFields
Seattle   Washington      53   75   Clear
Spokane   Washington      68   87   Clear
Yakima    Washington      67   89   Partly Cloudy

[2148] When the user navigates to the first row of data, the control will prompt the user with “Seattle, Wash.”. If the user issues the command “Read”, the control will prompt the user with the low and high temperatures and the sky conditions.
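The concatenation of header and content fields can be sketched with the weather data above. The column names (`City`, `State`, `Low`, `High`, `Sky`), the function name `concatFields`, and the single-space separator are assumptions; the text does not specify how the fields are joined or how a value such as the state name is rendered when spoken.

```javascript
// Joins the values of the named fields of one row into a prompt string.
function concatFields(row, fields) {
  return fields.map(function (name) {
    return String(row[name]);
  }).join(" ");
}

// Hypothetical column names for the weather table.
var weatherRows = [
  { City: "Seattle", State: "Washington", Low: 53, High: 75, Sky: "Clear" },
  { City: "Spokane", State: "Washington", Low: 68, High: 87, Sky: "Clear" },
  { City: "Yakima", State: "Washington", Low: 67, High: 89, Sky: "Partly Cloudy" }
];
var dataHeaderFields = ["City", "State"];
var dataContentFields = ["Low", "High", "Sky"];

// Header prompt played on navigation and content prompt played on "Read".
var headerPrompt = concatFields(weatherRows[0], dataHeaderFields);
var contentPrompt = concatFields(weatherRows[0], dataContentFields);
```

A HeaderTemplate or ContentTemplate, described below, would replace this plain concatenation when the author wants different wording.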

[2149] DisableColumnNavigation

[2150] Optional. Default: false. If true, the names of columns are not added to the grammar. Only the value of the DataHeader is played.

[2151] HeaderTemplate

[2152] Optional. Default: null. Gets or sets the template that defines how the headers are played. The way headers are read can be changed by specifying a template. The following example shows how to change the header to play a prompt like ‘Employee number ID’.

[2153] <HeaderTemplate>

[2154] Employee number <%# DataBinder.Eval(Container.DataItem, “EmployeeID”) %>

[2155] </HeaderTemplate>

[2156] ContentTemplate

[2157] Optional. Default: null

[2158] Gets or sets the template that defines how the contents are played. The way contents are read can be changed by specifying a template. The following example shows how to change the contents to play a prompt like ‘Employee number ID is Name’.

[2159] <ContentTemplate>

[2160] Employee number <%# DataBinder.Eval(Container.DataItem, “EmployeeID”) %> is <%# DataBinder.Eval(Container.DataItem, “LastName”) %>

[2161] </ContentTemplate>

[2162] Columns

[2163] Optional. Default: null. Collection of ColumnTemplate objects. Each ColumnTemplate object allows the specification of the prompt that will be played if the user requests that column. The following example shows this for the Title column:
<columns>
    <column name=‘Title’>
        <contentTemplate> The title of <%# DataBinder.Eval(Container.DataItem, “LastName”) %> is <%# DataBinder.Eval(Container.DataItem, “Title”) %> </contentTemplate>
    </column>
</columns>

[2164] ColumnTemplate's properties are:

[2165] string Name {get; set;}

[2166] Default: “ ”

[2167] Name of the column

[2168] ITemplate ContentTemplate {get; set;}

[2169] Default: null

[2170] Template used to create the prompt for that column

[2171] PromptSelectFunction

[2172] Optional. The QA parameter passed to this function is always “question”.

[2173] The lastCommandOrException argument will take the following values (in addition to the values listed in the description of lastCommandOrException in the Speech Controls Functional Specification):

[2174] NVG_previousOnFirstError when trying to get an item before the first one;

[2175] NVG_nextOnLastError when trying to get an item after the last one;

[2176] NVG_onlyItemError. This error message replaces NVG_previousOnFirstError and NVG_nextOnLastError when there is only one item in the datasource.

[2177] NVG_headers when requested to read the headers;

[2178] NVG_contents when requested to read the contents;

[2179] NVG_column when requested to read a specific column name. The name of the column to read is put in the Arg property of the AppControlData object passed in to the PromptSelectFunction associated with this control.

[2180] See Section 1.1.1 BasicApplicationControl Properties for a description of the PromptSelectFunction and its parameters.
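A PromptSelectFunction for the DataTableNavigator might branch on these additional lastCommandOrException values as sketched below. The function name `navigatorPrompt` and the prompt wording are hypothetical. The sketch also assumes that AppControlData exposes Index and DataTable alongside Arg, and that DataTable can be indexed by column name; the text above confirms only the Arg property.

```javascript
// Hypothetical PromptSelectFunction for a DataTableNavigator.
// Returning null lets the control play its default prompt.
function navigatorPrompt(lastCommandOrException, Count, SemanticItemList, QA, AppControlData) {
  switch (lastCommandOrException) {
    case "NVG_previousOnFirstError":
      return "You are already at the first item.";
    case "NVG_nextOnLastError":
      return "You are already at the last item.";
    case "NVG_onlyItemError":
      return "There is only one item.";
    case "NVG_column":
      // Arg names the requested column; Index and DataTable are assumed here.
      var column = AppControlData.Arg;
      return column + ": " + AppControlData.DataTable[column][AppControlData.Index];
    default:
      return null;
  }
}
```

The NVG_headers and NVG_contents cases fall through to the default here, so the header and content prompts built by the control (or by its templates) are played unchanged.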

[2181] 4.2 Client-Side Object

[2182] The client-side object contains the following properties:

[2183] int Index {get;}

[2184] Index of the current item. The index is zero-based.

[2185] int Max {get;}

[2186] Total number of items in the data.

[2187] Array[ ][ ] DataTable {get;}

[2188] Table containing the data elements. DataTable[column][index] contains the data in column ‘column’ and row ‘index’.

[2189] string PreviousCommandOrException {get;}

[2190] Name of the command or exception before last. Required to deal with repeats successfully.

[2191] string Arg {get;}

[2192] Name of the column to play when lastCommandOrException is NVG_column.

[2193] 4.3 Mark-Up
<speech:DataTableNavigator id=“...” SpeechIndex=“...” AllowCommands=“...”
  BabbleTimeout=“...” BargeIn=“...” CarrierGrammarUrl=“...”
  ClientActivationFunction=“...” EndSilence=“...” InitialTimeout=“...”
  MaxTimeout=“...” OnClientActiveFirst=“...” OnClientCompleteLast=“...”
  PostAnswerCarrierRule=“...” PreAnswerCarrierRule=“...” PromptSelectFunction=“...”
  QuestionPrompt=“...” PromptDatabase=“...” InitialShortTimeout=“...”
  DataSource=“...” DataMember=“...” DataHeaderFields=“...”
  DataContentFields=“...” DisableColumnNavigation=“...” SelectBehaviorMode=“...”
  runat=“server”>
  <HeaderPromptTemplate/>
  <ContentPromptTemplate/>
  <GrammarTemplate/>
  <columns> <ColumnTemplate/> </columns>
  <grammars>
    <grammar type=“Next” src=“...” active=“true|false”/>
    <grammar type=“Previous” src=“...” active=“true|false”/>
    <grammar type=“First” src=“...” active=“true|false”/>
    <grammar type=“Last” src=“...” active=“true|false”/>
    <grammar type=“Read” src=“...” active=“true|false”/>
    <grammar type=“Select” src=“...” active=“true|false”/>
    <grammar type=“Repeat” src=“...” active=“true|false”/>
  </grammars>
</speech:DataTableNavigator>

[2194] 4.4 Operation

[2195] This control is a voice-only control. It does not output anything in multi-modal mode.

[2196] 4.4.1 Execution Flow

[2197] In voice-only mode, control execution proceeds as follows:

[2198] If automatic navigation is off:

[2199] 1. Play DataHeaderFields (or prompts returned from PromptSelectFunction, or prompts specified by HeaderTemplate).

[2200] 2. Ask for command.

[2201] 3. If:

[2202] a. User asks for full content or a specific column, play DataContentFields. Go to 2.

[2203] b. User asks for navigation (previous/next/repeat), go to the specified row. Go to 1.

[2204] c. User utters exit command, stop.

[2205] d. User asks for header, go to 1.

[2206] If automatic navigation is on, step 2 is replaced by a short timeout after step 1, and silence means “next”. At the bottom of the data rows, the Next On Last Error Message is played, automatic navigation is disabled, and the flow continues at step 2.

[2207] If the DataSource property contains no data, the control does not render.
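The manual-navigation flow above can be sketched as a small state-transition function. This is only an illustrative Python sketch, not the control's implementation: the function name `navigate`, the command strings and the action names `play_header`, `play_content` and `stop` are hypothetical; only the NVG_* error names come from the text.

```python
def navigate(index, count, command):
    """Return (new_index, action) for one spoken command.

    action is "play_header", "play_content", "stop" (all hypothetical
    names) or one of the NVG_* error names listed for the control.
    """
    if command in ("read", "column"):
        return index, "play_content"      # step 3a, then ask again
    if command == "exit":
        return index, "stop"              # step 3c
    if command in ("header", "repeat"):
        return index, "play_header"       # step 3d, or repeat the current row
    if command in ("first", "last"):
        return (0 if command == "first" else count - 1), "play_header"
    if command in ("next", "previous"):
        target = index + (1 if command == "next" else -1)
        if 0 <= target < count:
            return target, "play_header"  # step 3b, back to step 1
        if count == 1:
            # With a single row, the only-item error replaces the other two.
            return index, "NVG_onlyItemError"
        return index, ("NVG_nextOnLastError" if command == "next"
                       else "NVG_previousOnFirstError")
    raise ValueError(f"unknown command: {command!r}")
```

A caller would loop on this function, playing the header or content prompt for the returned row and asking for the next command until the action is `stop`.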

[2208] 4.4.2 Default Prompts

[2209] Question prompt: Question or, if Question=“”, then “Next command?”

[2210] Question help: “Please say read, next, previous or cancel.”

[2211] Silence: “I didn't hear you”

[2212] NoReco: “I didn't understand you”

[2213] Previous On First Error Message: “You are already on the first item.”

[2214] Next On Last Error Message: “You are already on the last item.”

[2215] Previous/Next On Only Item Error Message: “This is the only item available.”

[2216] 5 AlphaDigit Control

[2217] The AlphaDigit control retrieves a string of numbers and letters. The format of the string is determined by a mask.
class AlphaDigit : ApplicationControl
{
  string SemanticItem{get; set;};
  bool Grouping{get; set;};
  string InputMask{get; set;};
  string PromptSelectFunction{get; set;};
}

[2218] 5.1 AlphaDigit Properties

[2219] Common properties are described above.

[2220] SemanticItem

[2221] Required. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the value spoken by the user. The spokenText expando property of the SemanticItem will be set to the spoken text used by the user to input an alphadigit. An exception will be thrown if SemanticItem is not specified or if it is not a valid semantic item, e.g., the ID does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2222] Grouping

[2223] Optional. Used in both multimodal and voice-only modes. Default: true. This flag decides whether digit groupings (e.g., “thirteen fifteen” for 1315) are allowed. Grouping can only occur when the input mask specifies digits using wildcards. For example: [?][?] allows “thirteen”, but [0-9][0-9] does not.

[2224] InputMask

[2225] Required. Used in both multimodal and voice-only modes. The InputMask defines the format of the input to the AlphaDigit control. The format must obey the following rules.

[2226] 1. Each position in valid input strings is characterized by a wildcard or a range in brackets.

[2227] 2. A wildcard can be either “A” for an alphabetical character, “D” for a numerical character, or “.” for either a numerical or alphabetical character. Each wildcard represents one character only.

[2228] 3. A range in brackets specifies what characters are acceptable. The allowable characters can be listed without spaces or commas. For example:

[2229] [123] allows “one,” “two,” or “three.”

[2230] A single character in brackets is also permitted, i.e., [1] is valid. A range of allowed characters or numbers can also be specified with a hyphen:

[2231] [1-3] allows values one through three.

[2232] A range specified in the form [x-y] is valid only if x<y. Multiple ranges and/or values can be concatenated together in a set: [1-5a-eiou]. Overlapping ranges are allowed; [1-53-8] is valid.

[2233] Wildcard characters are not permitted inside brackets; [A] is not valid.

[2234] 4. Spaces are permitted anywhere in the input mask string and are ignored.

[2235] 5. InputMask syntax is case sensitive. Ranges of letters must be specified in lowercase, [a-e], and wildcards must be specified in uppercase.

[2236] 6. White-space-only input masks, and any input masks not constructed according to the above rules, will generate an error at design time. Empty input masks will generate an error at runtime.
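The InputMask rules above can be illustrated with a short parser. The following Python sketch is an assumption-laden illustration and not the control's grammar generator: the names `parse_input_mask`, `_parse_set` and `matches` are hypothetical, a range is assumed to need both ends of the same kind (two digits or two lowercase letters), and every rule violation is reported as a `ValueError` rather than as separate design-time and runtime errors.

```python
LETTERS = "abcdefghijklmnopqrstuvwxyz"
DIGITS = "0123456789"
WILDCARDS = {"A": set(LETTERS), "D": set(DIGITS), ".": set(LETTERS + DIGITS)}


def _parse_set(body):
    """Expand the text between brackets, e.g. '1-5a-eiou', into a set."""
    if not body:
        raise ValueError("empty brackets")
    allowed = set()
    i = 0
    while i < len(body):
        lo = body[i]
        if lo not in WILDCARDS["."]:
            # Rejects wildcards, uppercase letters and stray hyphens.
            raise ValueError(f"character {lo!r} not permitted in brackets")
        if i + 2 < len(body) and body[i + 1] == "-":
            hi = body[i + 2]
            same_kind = hi in WILDCARDS["."] and lo.isdigit() == hi.isdigit()
            if not same_kind or not lo < hi:      # [x-y] requires x < y
                raise ValueError(f"invalid range {lo}-{hi}")
            allowed.update(chr(c) for c in range(ord(lo), ord(hi) + 1))
            i += 3
        else:
            allowed.add(lo)
            i += 1
    return allowed


def parse_input_mask(mask):
    """Return one set of allowed characters per input position."""
    text = mask.replace(" ", "")                  # rule 4: spaces are ignored
    if not text:
        raise ValueError("empty or white-space-only input mask")
    positions = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in WILDCARDS:                       # rule 2: one character each
            positions.append(set(WILDCARDS[ch]))
            i += 1
        elif ch == "[":                           # rule 3: bracketed set
            end = text.find("]", i)
            if end == -1:
                raise ValueError("unclosed bracket")
            positions.append(_parse_set(text[i + 1:end]))
            i = end + 1
        else:
            raise ValueError(f"unexpected character {ch!r} in mask")
    return positions


def matches(mask, text):
    """True if every character of text is allowed at its position."""
    positions = parse_input_mask(mask)
    return len(text) == len(positions) and all(
        ch in allowed for ch, allowed in zip(text, positions))
```

For instance, `matches("A D [1-3]", "b72")` accepts a letter, a digit and one of 1 to 3, while `parse_input_mask("[A]")` raises because a wildcard appears inside brackets.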

[2237] PromptSelectFunction

[2238] The QA parameter passed to this function may be either: “question”, “confirm” or “acknowledge”. See Section 1.1.1 BasicApplicationControl Properties for a description of the PromptSelectFunction and its parameters.

[2239] 5.2 Client-Side Object

[2240] The client-side object is reserved for future use and is not documented at this time.

[2241] 5.3 Mark-Up
<speech:AlphaDigit id=“...” SpeechIndex=“...” AllowCommands=“...”
  BabbleTimeout=“...” BargeIn=“...” CarrierGrammarUrl=“...”
  ClientActivationFunction=“...” EndSilence=“...” InitialTimeout=“...”
  MaxTimeout=“...” OnClientActiveFirst=“...” OnClientCompleteLast=“...”
  PostAnswerCarrierRule=“...” PreAnswerCarrierRule=“...” PromptSelectFunction=“...”
  QuestionPrompt=“...” PromptDatabase=“...” AutoPostback=“...”
  ConfirmThreshold=“...” ConfirmRejectThreshold=“...” CompleteLast=“...”
  Mode=“...” OnClientActive=“...” OnClientComplete=“...”
  OnClientListening=“...” PostConfirmCarrierRule=“...” PreConfirmCarrierRule=“...”
  RejectThreshold=“...” StartElement=“...” StartEvent=“...”
  StopElement=“...” StopEvent=“...” SemanticItem=“...”
  Grouping=“...” InputMask=“...” runat=“server”/>

[2242] 5.4 Operation

[2243] 5.4.1 Execution Flow

[2244] In voice-only mode, control execution proceeds as follows:
SpeechIndex   QA
1             Confirm
2             Question
3             Done

[2245] In multimodal mode, the start event starts the recognition for the whole alpha-digit string and binds the results.

[2246] 5.4.2 Default prompts

[2247] The default prompts are:

[2248] Question QA

[2249] Question: Must be specified by user or an error will be returned.

[2250] Help: “Please tell me a series of letters and or digits”+Question

[2251] Confirm QA

[2252] Confirm: “Did you say”+SemanticItem.spokenText

[2253] ConfirmHelp: “Please say yes or no, or tell me the correct series of letters or digits.”

[2254] Also, if short timeout confirmation is allowed, i.e., FirstInitialTimeout>0, the prompt is:

[2255] SemanticItem.spokenText+?

[2256] Done QA

[2257] Prompt: “ ”

[2258] All QA Controls

[2259] Silence: “I didn't hear you.”

[2260] NoReco: “I didn't understand you.”

5.5 EXAMPLES

[2261] control: “What is the number?”

[2262] User: “one four two five one”

[2263] control: “Did you say 1 4 2 5 1?”

[2264] User: “yes”

[2265] 6 NaturalNumber Control

[2266] The NaturalNumber control retrieves a natural number between 0 and 999,999. The NaturalNumber control also inherits from the IDTMF interface.
class NaturalNumber : ApplicationControl
{
  string SemanticItem{get; set;};
  int LowerBound{get; set;};
  int UpperBound{get; set;};
  SemanticEvent ValidationEvent{get; set;};
  string PromptSelectFunction{get; set;};
}

[2267] 6.1 NaturalNumber Properties

[2268] Common properties are described above.

[2269] SemanticItem

[2270] Required. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the value spoken by the user. An exception will be thrown if SemanticItem is not specified or if it is not a valid semantic item, e.g., the ID does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2271] LowerBound

[2272] Optional. Used in both multimodal and voice-only modes. Default: 0. Lower boundary of acceptable answers. Must be greater than or equal to zero and less than UpperBound. An exception will be thrown if LowerBound is less than zero or greater than or equal to UpperBound.

[2273] UpperBound

[2274] Optional. Used in both multimodal and voice-only modes. Default: 999,999. Upper boundary of acceptable answers. An exception will be thrown if UpperBound is greater than 999,999 or is less than or equal to LowerBound.
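The constraints on LowerBound and UpperBound can be checked with a few comparisons. The Python sketch below is illustrative only: `check_bounds` and `in_range` are hypothetical names, a LowerBound of zero is assumed to be legal because it is the stated default, and the range test is assumed to be inclusive at both ends, which the text does not state.

```python
MAX_NATURAL = 999_999  # largest value the control is described as accepting


def check_bounds(lower=0, upper=MAX_NATURAL):
    """Raise ValueError for bound settings the text describes as invalid."""
    if lower < 0:
        raise ValueError("LowerBound is less than zero")
    if upper > MAX_NATURAL:
        raise ValueError("UpperBound is greater than 999,999")
    if lower >= upper:
        raise ValueError("LowerBound must be less than UpperBound")
    return lower, upper


def in_range(value, lower=0, upper=MAX_NATURAL):
    """True if value lies within the bounds (inclusive, by assumption)."""
    lower, upper = check_bounds(lower, upper)
    return lower <= value <= upper
```

With the defaults, `in_range(20)` is true, and `check_bounds(10, 10)` raises because the lower bound is not below the upper bound.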

[2275] ValidationEvent

[2276] Optional. Only used in voice-only mode. Default is SemanticEvent.onconfirmed. Must be either SemanticEvent.onconfirmed or SemanticEvent.onchanged. Indicates when the control will validate that the number is within the range specified, either after the number is input (or changed) or after the number has been confirmed.

[2277] PromptSelectFunction

[2278] Optional. Only used in voice-only mode. The QA parameter passed to this function may be either: “question”, “confirm”, “validation”, “acknowledge”. See Section 1.1.1 BasicApplicationControl Properties for a description of the PromptSelectFunction and its parameters.

[2279] 6.2 Client-Side Object

[2280] The object passed to this function contains the following properties:

[2281] int LowerBound {get;}

[2282] the lower bound;

[2283] int UpperBound {get;}

[2284] the upper bound;

[2285] 6.3 Mark-Up
<speech:NaturalNumber id=“...” SpeechIndex=“...” AllowCommands=“...”
  BabbleTimeout=“...” BargeIn=“...” CarrierGrammarUrl=“...”
  ClientActivationFunction=“...” EndSilence=“...” InitialTimeout=“...”
  MaxTimeout=“...” OnClientActiveFirst=“...” OnClientCompleteLast=“...”
  PostAnswerCarrierRule=“...” PreAnswerCarrierRule=“...” PromptSelectFunction=“...”
  QuestionPrompt=“...” PromptDatabase=“...” AutoPostback=“...”
  ConfirmThreshold=“...” ConfirmRejectThreshold=“...” CompleteLast=“...”
  Mode=“...” OnClientActive=“...” OnClientComplete=“...”
  OnClientListening=“...” PostConfirmCarrierRule=“...” PreConfirmCarrierRule=“...”
  RejectThreshold=“...” StartElement=“...” StartEvent=“...”
  StopElement=“...” StopEvent=“...” SemanticItem=“...”
  LowerBound=“...” UpperBound=“...” AllowDTMF=“...”
  InterDigitTimeout=“...” OnClientKeyPress=“...” PreFlush=“...”
  runat=“server”/>

[2286] 6.4 Operation

[2287] 6.4.1 Execution Flow

[2288] In voice-only mode, control execution proceeds as follows:
SpeechIndex   QA
1             Confirm
2             Question
3             Validate
4             Done

[2289] In multimodal mode, the start event starts recognition for the number. If the number is in the LowerBound-UpperBound range, the value is bound.

[2290] 6.4.2 Default Prompts

[2291] The default prompts are:

[2292] Question QA

[2293] Question: Must be specified by user or an error will be returned.

[2294] Question help: Say a number.

[2295] Confirm QA

[2296] Confirm: “Did you say”+SemanticItem.value

[2297] ConfirmHelp: “Confirm by saying yes or no, or tell me the correct number.”

[2298] Also, if short timeout confirmation is allowed, i.e., FirstInitialTimeout>0, the prompt is:

[2299] SemanticItem.value

[2300] Validation QA

[2301] Prompt: “I am expecting a number from lowerbound to upperbound”

[2302] if LowerBound is >0

[2303] Prompt: “I am expecting a number larger than lowerbound and smaller than upperbound”

[2304] The default lowerbound is zero and the default upper bound is 1,000,000.

[2305] if number recognized is > UpperBound

[2306] All QA Controls

[2307] Silence: “I didn't hear you.”

[2308] NoReco: “I didn't understand you.”

6.5 EXAMPLES

[2309] control: “How many do you want?”

[2310] User: “twenty”

[2311] control: “Did you say 20?”

[2312] User: “yes”

[2313] 7 Currency Control

[2314] The Currency control retrieves an amount in US Dollars. The Currency control also inherits from the IDTMF interface.
class Currency : ApplicationControl
{
  string SemanticItem{get; set;};
  bool PreferDollars{get; set;};
  string PromptSelectFunction{get; set;};
}

[2315] 7.1 Properties

[2316] Common properties are described above.

[2317] SemanticItem

[2318] Required. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the value spoken by the user. An exception will be thrown if SemanticItem is not specified or if it is not a valid semantic item, e.g., the ID does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2319] PreferDollars

[2320] Optional. Used in both multimodal and voice-only modes. Default: false. When users say an amount like “two fifty”, this can be interpreted as either $2.50 or $250. If PreferDollars is true, the amount that does not use cents is preferred. Otherwise the amount using cents is preferred. There is no upper limit on the amount of currency recognized using this control; it is the responsibility of the application developer to implement any desired limits.
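The effect of PreferDollars on an ambiguous utterance can be shown with a short calculation. This Python sketch is illustrative only: `resolve_amount` is a hypothetical name, and it assumes the recognizer has already split the utterance into two numbers (for “two fifty”, 2 and 50), which the text does not describe.

```python
from decimal import Decimal


def resolve_amount(first, second, prefer_dollars=False):
    """Resolve an utterance such as "two fifty" (first=2, second=50)."""
    if not 0 <= second <= 99:
        raise ValueError("second part must be a two-digit number")
    if prefer_dollars:
        # Whole-dollar reading: "two fifty" becomes 250.
        return Decimal(first * 100 + second)
    # Dollars-and-cents reading: "two fifty" becomes 2.50.
    return Decimal(first) + Decimal(second) / 100
```

Under these assumptions, `resolve_amount(2, 50)` gives 2.50 and `resolve_amount(2, 50, prefer_dollars=True)` gives 250.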

[2321] PromptSelectFunction

[2322] Optional. Only used in voice-only mode. The QA parameter passed to this function may be either: “question”, “confirm” or “acknowledge”.

[2323] See Section 1.1.1 BasicApplicationControl Properties for a description of the PromptSelectFunction and its parameters.

[2324] 7.2 Client-Side Object

[2325] The client-side object is reserved for future use and is not documented at this time.

[2326] 7.3 Mark-Up
<speech:Currency id=“...” SpeechIndex=“...” AllowCommands=“...”
  BabbleTimeout=“...” BargeIn=“...” CarrierGrammarUrl=“...”
  ClientActivationFunction=“...” EndSilence=“...” InitialTimeout=“...”
  MaxTimeout=“...” OnClientActiveFirst=“...” OnClientCompleteLast=“...”
  PostAnswerCarrierRule=“...” PreAnswerCarrierRule=“...” PromptSelectFunction=“...”
  QuestionPrompt=“...” PromptDatabase=“...” AutoPostback=“...”
  ConfirmThreshold=“...” ConfirmRejectThreshold=“...” CompleteLast=“...”
  Mode=“...” OnClientActive=“...” OnClientComplete=“...”
  OnClientListening=“...” PostConfirmCarrierRule=“...” PreConfirmCarrierRule=“...”
  RejectThreshold=“...” StartElement=“...” StartEvent=“...”
  StopElement=“...” StopEvent=“...” AllowDTMF=“...”
  InterDigitTimeout=“...” OnClientKeyPress=“...” PreFlush=“...”
  SemanticItem=“...” PreferDollars=“...” runat=“server”/>

[2327] 7.4 Operation

[2328] The control understands amounts up to 1 million. Amounts like “two ninety nine” are resolved based on the value of the PreferDollars property.

[2329] 7.4.1 Execution Flow

[2330] In voice-only mode, control execution proceeds as follows:
SpeechIndex   QA
1             Confirm
2             Question
3             Done

[2331] In multimodal mode, the start event starts recognition for the whole amount and binds the results.

[2332] 7.4.2 Default Prompts

[2333] The default prompts are:

[2334] Question QA

[2335] Question: Must be specified by user or an error will be returned.

[2336] Question Help: “Please tell me an amount. For example ten dollars or ten dollars and fifty cents.”+Question

[2337] Confirm QA

[2338] Confirm: “Did you say”+SemanticItem.value

[2339] ConfirmHelp: “Please say yes or no, or tell me the correct amount.”

[2340] If short timeout confirmation is allowed, i.e., FirstInitialTimeout>0, the prompt is:

[2341] SemanticItem.value+?

[2342] Done QA

[2343] Prompt: “”

[2344] All QA Controls

[2345] Silence: “I didn't hear you”

[2346] NoReco: “I didn't understand you”

[2347] 8 Phone Control

[2348] The Phone control retrieves a 10 digit US Phone number. If the user includes an extra digit at the beginning of the phone number (such as a 1 for long distance or a 9 for dial out) the extra digit will be dropped. The Phone control also inherits from the IDTMF interface.
class Phone : ApplicationControl
{
  string AreaCodeSemanticItem{get; set;};
  string LocalNumberSemanticItem{get; set;};
  string ExtensionSemanticItem{get; set;};
  string StartElementAreaCode{get; set;};
  string StartEventAreaCode{get; set;};
  string StopElementAreaCode{get; set;};
  string StopEventAreaCode{get; set;};
  string StartElementLocalNumber{get; set;};
  string StartEventLocalNumber{get; set;};
  string StopElementLocalNumber{get; set;};
  string StopEventLocalNumber{get; set;};
  string StartElementExtension{get; set;};
  string StartEventExtension{get; set;};
  string StopElementExtension{get; set;};
  string StopEventExtension{get; set;};
  string PromptSelectFunction{get; set;};
  bool RequiresAreaCode{get; set;};
}

[2349] 8.1 Phone Properties

[2350] Common properties are described above.

[2351] AreaCodeSemanticItem

[2352] Required. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the area code value spoken by the user. If the retrieved area code starts with a “1”, e.g., “1-800”, the “1” is not returned in the results. An exception will be thrown if AreaCodeSemanticItem is not specified or if it is not a valid semantic item, e.g., the ID does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2353] LocalNumberSemanticItem

[2354] Required. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the local number value spoken by the user. An exception will be thrown if LocalNumberSemanticItem is not specified or if it is not a valid semantic item, e.g., the ID does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2355] ExtensionSemanticItem

[2356] Optional. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the extension value spoken by the user. If specified, the control will allow the user to enter an extension. The maximum length of the extension is five digits. If specified, an exception will be thrown if ExtensionSemanticItem is not a valid semantic item, e.g., the ID does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2357] StartElementAreaCode

[2358] Optional. Only used in multimodal mode. Default=“”. The id of the GUI control whose event starts recognition of the area code part.

[2359] StopElementAreaCode

[2360] Optional. Only used in multimodal mode. Default=“”. The id of the GUI control whose event stops recognition of the area code part.

[2361] StartEventAreaCode

[2362] Optional. Only used in multimodal mode. Default=“”. The name of the event that starts recognition of the area code part.

[2363] StopEventAreaCode

[2364] Optional. Only used in multimodal mode. Default=“”. The name of the event that stops recognition of the area code part.

[2365] StartElementLocalNumber

[2366] Optional. Only used in multimodal mode. Default=“”. The id of the GUI control whose event starts recognition of the local number part.

[2367] StopElementLocalNumber

[2368] Optional. Only used in multimodal mode. Default=“”. The id of the GUI control whose event stops recognition of the local number part.

[2369] StartEventLocalNumber

[2370] Optional. Only used in multimodal mode. Default=“”. The name of the event that starts recognition of the local number part.

[2371] StopEventLocalNumber

[2372] Optional. Only used in multimodal mode. Default=“”. The name of the event that stops recognition of the local number part.

[2373] StartElementExtension

[2374] Optional. Only used in multimodal mode. Default=“”. The id of the GUI control whose event starts recognition of the extension part.

[2375] StopElementExtension

[2376] Optional. Only used in multimodal mode. Default=“”. The id of the GUI control whose event stops recognition of the extension part.

[2377] StartEventExtension

[2378] Optional. Only used in multimodal mode. Default=“”. The name of the event that starts recognition of the extension part.

[2379] StopEventExtension

[2380] Optional. Only used in multimodal mode. Default=“”. The name of the event that stops recognition of the extension part.

[2381] PromptSelectFunction

[2382] Optional. Only used in voice-only mode. The QA parameter passed to this function may be either: “question”, “confirmLocalNumber”, “questionAreaCode”, “confirmAreaCode”, “questionExtension”, “confirmExtension”, or “acknowledge”.

[2383] See Section 1.1.1 BasicApplicationControl Properties for a description of the PromptSelectFunction and its parameters.

[2384] RequiresAreaCode

[2385] Optional. Used in both multimodal and voice-only modes. If true, the control will ask for area code. If false, the control will not ask for area code.

8.2 Client-Side Object

[2386] The client-side object is reserved for future use and is not documented at this time.

[2387] 8.3 Mark-Up
<speech:Phone id=“...” SpeechIndex=“...” AllowCommands=“...”
  BabbleTimeout=“...” BargeIn=“...” CarrierGrammarUrl=“...”
  ClientActivationFunction=“...” EndSilence=“...” InitialTimeout=“...”
  MaxTimeout=“...” OnClientActiveFirst=“...” OnClientCompleteLast=“...”
  PostAnswerCarrierRule=“...” PreAnswerCarrierRule=“...” PromptSelectFunction=“...”
  QuestionPrompt=“...” PromptDatabase=“...” AutoPostback=“...”
  ConfirmThreshold=“...” ConfirmRejectThreshold=“...” CompleteLast=“...”
  Mode=“...” OnClientActive=“...” OnClientComplete=“...”
  OnClientListening=“...” PostConfirmCarrierRule=“...” PreConfirmCarrierRule=“...”
  RejectThreshold=“...” StartElement=“...” StartEvent=“...”
  StopElement=“...” StopEvent=“...” AllowDTMF=“...”
  InterDigitTimeout=“...” OnClientKeyPress=“...” PreFlush=“...”
  StartElementAreaCode=“...” StopElementAreaCode=“...”
  StartEventAreaCode=“...” StopEventAreaCode=“...”
  StartElementLocalNumber=“...” StopElementLocalNumber=“...”
  StartEventLocalNumber=“...” StopEventLocalNumber=“...”
  StartElementExtension=“...” StopElementExtension=“...”
  StartEventExtension=“...” StopEventExtension=“...”
  AreaCodeSemanticItem=“...” LocalNumberSemanticItem=“...”
  ExtensionSemanticItem=“...” RequiresAreaCode=“...” runat=“server”/>

[2388] 8.4 Operation

[2389] 8.4.1 Execution Flow

[2390] The collection of digits is split into: 3-7-X where X is the number of extension digits, up to 5.

[2391] In voice-only mode, control execution proceeds as follows:
SpeechIndex   QA
1             QuestionLocalNumber
2             QuestionAreaCode
3             ConfirmLocalNumber
4             ConfirmAreaCode
5             QuestionExtension
6             ConfirmExtension
7             Done

[2392] In multimodal mode, the start event starts the recognition for the whole phone number and binds the result. Area code, local number and extension start events start recognition for those semantic items separately.
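The 3-7-X split of a recognized digit string can be sketched in a few lines of Python. The function name `split_phone_digits` is hypothetical, and the sketch assumes that any extra leading digit (such as a 1 or a 9) has already been dropped before the split, since the text does not say how that digit is told apart from an extension digit.

```python
def split_phone_digits(digits):
    """Split a digit string into (area code, local number, extension)."""
    if not digits.isdigit() or not 10 <= len(digits) <= 15:
        raise ValueError("expected 10 digits plus up to 5 extension digits")
    # 3 digits of area code, 7 of local number, the rest is the extension.
    return digits[:3], digits[3:10], digits[10:]
```

The three parts would then be bound to AreaCodeSemanticItem, LocalNumberSemanticItem and ExtensionSemanticItem respectively; an empty third part corresponds to no extension.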

[2393] 8.4.2 Default Prompts

[2394] The default prompts are:

[2395] QuestionFullNumber:

[2396] Question: Must be specified by user or an error will be returned.

[2397] Help: “Please tell me the phone number.”

[2398] QuestionLocalNumber QA

[2399] Question: QuestionPrompt

[2400] Help: “Please tell me the seven digit local phone number”

[2401] QuestionAreaCode QA

[2402] AreaCodeQuestion: “What is the Area Code?”

[2403] Help: “Please tell me the three digit area code”

[2404] QuestionExtension QA

[2405] ExtensionQuestion: “Any extension?”

[2406] Help: “Please tell me the extension number. Say no extension if there is none.”

[2407] ConfirmAreaCode QA

[2408] “Is the area code”+AreaCodeSemanticItem.value+?

[2409] If short timeout confirmation is enabled, i.e., FirstInitialTimeout>0, then the prompt is:

[2410] AreaCodeSemanticItem.value+?

[2411] ConfirmLocalNumber QA

[2412] “Is the number”+LocalNumberSemanticItem.value+?

[2413] If short timeout confirmation is enabled, i.e., FirstInitialTimeout>0, then the prompt is:

[2414] LocalNumberSemanticItem.value+?

[2415] ConfirmExtension QA

[2416] If an extension is detected, the prompt is:

[2417] “Is the extension”+ExtensionSemanticItem.value+?

[2418] If short timeout confirmation is enabled, i.e., FirstInitialTimeout>0, then the prompt is:

[2419] ExtensionSemanticItem.value+?

[2420] If the user says “No” to the QuestionExtension prompt, the confirm prompt is:

[2421] “No extension, is that right?”

[2422] All Confirm QA Controls

[2423] Help: “Please say yes or no, or tell me the correct number.”

[2424] All QA Controls

[2425] Silence: “I didn't hear you.”

[2426] NoReco: “I didn't understand you.”

[2427] 9 ZipCode Control

[2428] The ZipCode control retrieves a US Zip Code. The ZipCode control also inherits from the IDTMF interface.
class ZipCode : ApplicationControl
{
  string ZipCodeSemanticItem{get; set;};
  string ExtensionSemanticItem{get; set;};
  string StartElementZipcode{get; set;};
  string StartEventZipcode{get; set;};
  string StopElementZipcode{get; set;};
  string StopEventZipcode{get; set;};
  string StartElementExtension{get; set;};
  string StartEventExtension{get; set;};
  string StopElementExtension{get; set;};
  string StopEventExtension{get; set;};
  string PromptSelectFunction{get; set;};
}

[2429] 9.1 ZipCode Properties

[2430] Common properties are described above.

[2431] ZipcodeSemanticItem

[2432] Required. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the zipcode value spoken by the user. The “value” expando property of the ZipcodeSemanticItem will be set to the text spoken by the user when entering a zip code. An exception will be thrown if ZipcodeSemanticItem is not specified or if it is not a valid semantic item, e.g., the ID does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2433] ExtensionSemanticItem

[2434] Optional. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the extension value spoken by the user. If the extension semantic item id is not specified, the control will not ask for an extension and no QA controls related to the extension will be output. If specified, an exception will be thrown if the ID does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2435] StartElementZipcode

[2436] Optional. Only used in multimodal mode. Default=“”. The id of the GUI control whose event starts recognition of the zipcode.

[2437] StopElementZipcode

[2438] Optional. Only used in multimodal mode. Default=“”. The id of the GUI control whose event stops recognition of the zipcode.

[2439] StartEventZipcode

[2440] Optional. Only used in multimodal mode. Default=“”. The name of the event that starts recognition of the zipcode.

[2441] StopEventZipcode

[2442] Optional. Only used in multimodal mode. Default=“”. The name of the event that stops recognition of the zipcode.

[2443] StartElementExtension

[2444] Optional. Only used in multimodal mode. Default=“”. The id of the GUI control whose event starts recognition of the extension.

[2445] StopElementExtension

[2446] Optional. Only used in multimodal mode. Default=“”. The id of the GUI control whose event stops recognition of the extension.

[2447] StartEventExtension

[2448] Optional. Only used in multimodal mode. Default=“”. The name of the event that starts recognition of the extension part.

[2449] StopEventExtension

[2450] Optional. Only used in multimodal mode. Default=“”. The name of the event that stops recognition of the extension part.

[2451] PromptSelectFunction

[2452] Optional. Only used in voice-only mode. The QA parameter passed to this function may be either: “question”, “questionExtension”, “confirmCode”, “confirmExtension”, “acknowledge”.

[2453] See Section 1.1.1 BasicApplicationControl Properties for a description of the PromptSelectFunction and its parameters.

[2454] 9.2 Client-Side Object

[2455] The client-side object is reserved for future use and is not documented at this time.

[2456] 9.3 Mark-Up
<speech:ZipCode id=“...” SpeechIndex=“...” AllowCommands=“...”
  BabbleTimeout=“...” BargeIn=“...” CarrierGrammarUrl=“...”
  ClientActivationFunction=“...” EndSilence=“...” InitialTimeout=“...”
  MaxTimeout=“...” OnClientActiveFirst=“...” OnClientCompleteLast=“...”
  PostAnswerCarrierRule=“...” PreAnswerCarrierRule=“...” PromptSelectFunction=“...”
  QuestionPrompt=“...” PromptDatabase=“...” AutoPostback=“...”
  ConfirmThreshold=“...” ConfirmRejectThreshold=“...” CompleteLast=“...”
  Mode=“...” OnClientActive=“...” OnClientComplete=“...”
  OnClientListening=“...” PostConfirmCarrierRule=“...” PreConfirmCarrierRule=“...”
  RejectThreshold=“...” StartElement=“...” StartEvent=“...”
  StopElement=“...” StopEvent=“...” AllowDTMF=“...”
  InterDigitTimeout=“...” OnClientKeyPress=“...” PreFlush=“...”
  StartElementZipcode=“...” StopElementZipcode=“...”
  StartEventZipcode=“...” StopEventZipcode=“...”
  StartElementExtension=“...” StopElementExtension=“...”
  StartEventExtension=“...” StopEventExtension=“...”
  ZipCodeSemanticItem=“...” ExtensionSemanticItem=“...” runat=“server”/>

[2457] 9.4 Operation

[2458] The control asks the question/confirmation repeatedly until an answer is obtained with confidence above the ConfirmThreshold or it is confirmed.

[2459] The collection of digits is split into: 5-4.

[2460] 9.4.1 Execution Flow

[2461] In voice-only mode, control execution proceeds as follows:
SpeechIndex   QA
1             ConfirmZipCode
2             ConfirmExtension
3             QuestionZipCode
4             QuestionExtension
5             Done

[2462] In multimodal mode, the start event starts the recognition for the whole zip code and binds the result. Events hooked to individual items start collection only for the associated item.

[2463] 9.4.2 Default Prompts

[2464] The default prompts are:

[2465] QuestionZipCode QA

[2466] Question: Must be specified by user or an error will be returned.

[2467] Help: “Please tell me the zip code.”

[2468] QuestionExtension QA

[2469] ExtensionQuestion: “Any zip plus four extension?”

[2470] Help: “Please tell me the zip plus four extension, say no extension if there is none”

[2471] ConfirmZipCode QA

[2472] Question: “Did you say”+ZipcodeSemanticItem.value+?

[2473] Confirmation Help: “Please say yes or no or tell me the correct number.”

[2474] If short timeout confirmation is enabled, i.e., FirstInitialTimeout>0, then the prompt is:

[2475] ZipcodeSemanticItem.value+?

[2476] ConfirmExtension QA

[2477] Question: “Did you say”+ExtensionSemanticItem.value+?

[2478] Confirmation: “There is no extension. Is that right?”

[2479] If short timeout confirmation is enabled, i.e., FirstInitialTimeout>0, then the prompt is:

[2480] ExtensionSemanticItem.value+?

[2481] All QA Controls

[2482] Silence: “I didn't hear you”

[2483] NoReco: “I didn't understand you”

[2484] 10 SocialSecurityNumber Control

[2485] The SocialSecurityNumber control retrieves a US Social Security number. The SocialSecurityNumber control also inherits from the IDTMF interface.
class SocialSecurityNumber : ApplicationControl
{
  string SemanticItem{get; set;};
  string Separator{get; set;};
  string PromptSelectFunction{get; set;};
}

[2486] 10.1 SocialSecurityNumber Properties

[2487] Common properties are described above.

[2488] SemanticItem

[2489] Required. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the value spoken by the user. The “value” expando property of SemanticItem will be set to the text spoken by the user when entering a social security number. An exception will be thrown if SemanticItem is not specified or if it is not a valid semantic item, e.g., the ID does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2490] Separator

[2491] Optional. Used in both multimodal and voice-only modes. This string (like “-”) will be inserted between the fields. The Separator is not used in the grammar, e.g., “123 dash 45 dash 6789” returns a noreco.

[2492] PromptSelectFunction

[2493] Optional. Only used in voice-only mode. The QA parameter passed to this function may be one of: “question”, “questionField2”, “questionField3”, “confirmField1”, “confirmField2”, “confirmField3”, “acknowledge”.

[2494] For confirms, the SemanticItemList parameter will contain one semantic item object holding the value to confirm.

[2495] See Section 1.1.1 BasicApplicationControl Properties for a description of the PromptSelectFunction and its parameters.
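A prompt select function for the SocialSecurityNumber control might look like the following minimal sketch. The function name selectSsnPrompt, the parameter order (QA identifier first, semantic item list second), the returned wording, and the empty string returned for “acknowledge” are assumptions made for illustration; the actual parameters are those described in the section referenced above.

```javascript
// Hypothetical prompt select function (assumed signature, not the documented one).
// qa: one of the QA identifiers listed for this control.
// semanticItemList: for confirms, assumed to hold one item with the value to confirm.
function selectSsnPrompt(qa, semanticItemList) {
  switch (qa) {
    case "question":
      return "What is your social security number?";
    case "questionField2":
      return "What are the next two digits?";
    case "questionField3":
      return "What are the last four digits?";
    case "confirmField1":
    case "confirmField2":
    case "confirmField3":
      // Short confirmation form: echo the field value followed by "?".
      return semanticItemList[0].value + "?";
    default:
      // "acknowledge" and unknown identifiers: no prompt text (assumption).
      return "";
  }
}
```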

[2496] 10.2 Client-Side Object

[2497] The client-side object is reserved for future use and is not documented at this time.

[2498] 10.3 Mark-Up

<speech:SocialSecurityNumber id=“...” SpeechIndex=“...”
    AllowCommands=“...” BabbleTimeout=“...” BargeIn=“...”
    CarrierGrammarUrl=“...” ClientActivationFunction=“...” EndSilence=“...”
    InitialTimeout=“...” MaxTimeout=“...” OnClientActiveFirst=“...”
    OnClientCompleteLast=“...” PostAnswerCarrierRule=“...”
    PreAnswerCarrierRule=“...” PromptSelectFunction=“...”
    QuestionPrompt=“...” PromptDatabase=“...” AutoPostback=“...”
    ConfirmThreshold=“...” ConfirmRejectThreshold=“...” CompleteLast=“...”
    Mode=“...” OnClientActive=“...” OnClientComplete=“...”
    OnClientListening=“...” PostConfirmCarrierRule=“...”
    PreConfirmCarrierRule=“...” RejectThreshold=“...” StartElement=“...”
    StartEvent=“...” StopElement=“...” StopEvent=“...” AllowDTMF=“...”
    InterDigitTimeout=“...” OnClientKeyPress=“...” PreFlush=“...”
    SemanticItem=“...” Separator=“...” runat=“server”/>

[2499] 10.4 Operation

[2500] The collection of digits is split into: 3-2-4. There are 3 hidden semantic item objects created to hold values for the 3 parts of a social security number. The appropriate hidden semantic item object is passed to the PromptSelectFunction during confirmation of the corresponding part of the social security number. The semantic item object specified by the SemanticItem property of the control is filled using the hidden objects just before the OnClientCompleteLast function call.
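The 3-2-4 split and the use of the Separator string can be sketched as follows. This is an illustrative sketch only: the function names splitSsn and fillSsnValue are hypothetical, and the hidden semantic item objects are represented here simply as an array of strings.

```javascript
// Splits nine digits into the three fields (3-2-4).
function splitSsn(digits) {
  if (!/^\d{9}$/.test(digits)) {
    throw new RangeError("expected exactly nine digits");
  }
  return [digits.slice(0, 3), digits.slice(3, 5), digits.slice(5)];
}

// Builds the value for the author-specified semantic item by inserting the
// Separator string between the fields; no Separator means plain concatenation.
function fillSsnValue(fields, separator) {
  return fields.join(separator || "");
}
```

For example, fillSsnValue(splitSsn("123456789"), "-") yields “123-45-6789”.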

[2501] 10.4.1 Execution Flow

[2502] In voice-only mode, control execution follows this flow:

SpeechIndex | QA
1 | Field1Confirm
2 | Field2Confirm
3 | Field3Confirm
4 | ConfirmFullNumber
5 | MainQuestion
6 | Field2Question
7 | Field3Question
8 | Done

[2503] For a social security number gathered outside and passed into the SocialSecurityNumber control for confirmation, the voice-only execution begins at SpeechIndex 4.

[2504] In multimodal mode, the start event starts the recognition for the whole social security number and binds the result.

[2505] 10.4.2 Default Prompts

[2506] The default prompts are:

[2507] MainQuestion QA

[2508] Question: Must be specified by user or an error will be returned.

[2509] Help: “Please tell me the social security number.”

[2510] Field Question QA Controls

[2511] Field2 Question: “What are the next two digits?”

[2512] Field3 Question: “What are the last four digits?”

[2513] Help: “Please tell me the remaining digits of the social security number.”

[2514] Field Confirm QA Controls

[2515] “Is the social security number”+SemanticItem.value+?

[2516] If short timeout confirmation is enabled (FirstInitialTimeout>0), the prompt is:

[2517] SemanticItem.value+?

[2518] Help: “Please say yes or no, or tell me the correct digits.”

[2519] Done QA

[2520] Prompt: “ ”

[2521] All QA Controls

[2522] Silence: “I didn't hear you.”

[2523] NoReco: “I didn't understand you.”

[2524] For a social security number gathered outside the SocialSecurityNumber control, the confirmation prompt is: “Is your social security number”+SemanticItem.value+?

10.4.3 EXAMPLES

[2525] control: “What is your social security number?”

[2526] User: “one two three four five six seven eight nine”

[2527] control: “1 2 3”

[2528] User: “yes” (or short time out confirmation)

[2529] control: “4 5”

[2530] User: “yes” (or short time out confirmation)

[2531] control: “6 7 8 9”

[2532] User: “” (short time out confirmation)

[2533] (for a social security number gathered outside the SocialSecurityNumber control)

[2534] control: “Is your social security number 1 2 3 4 5 6 7 8 9?”

[2535] User: “No, it's 9 8 7 6 5 4 3 2 1”

[2536] 11 Date Control

[2537] The Date control retrieves a date.

class Date : ApplicationControl
{
    string DaySemanticItem{get; set;};
    string MonthSemanticItem{get; set;};
    string YearSemanticItem{get; set;};
    Enumeration DateContextEnumeration;
    DateContextEnumeration DateContext{get; set;};
    bool AllowRelativeDates{get; set;};
    bool AllowHolidays{get; set;};
    bool AllowNumeralDates{get; set;};
    string PromptSelectFunction{get; set;};
    string StartElementDay{get; set;};
    string StartEventDay{get; set;};
    string StartElementMonth{get; set;};
    string StartEventMonth{get; set;};
    string StartElementYear{get; set;};
    string StartEventYear{get; set;};
    string StopElementDay{get; set;};
    string StopEventDay{get; set;};
    string StopElementMonth{get; set;};
    string StopEventMonth{get; set;};
    string StopElementYear{get; set;};
    string StopEventYear{get; set;};
    int FallbackCount{get; set;};
}

[2538] 11.1 Date Properties

[2539] Common properties are described above.

[2540] DaySemanticItem

[2541] Required. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the day value spoken by the user. If the value is assumed by the control and the semantic item is empty, the “assumed” expando property of DaySemanticItem will be set to true. This property is removed when the value is confirmed by the user. The “spokenText” expando property will be set to the text spoken by the user which effectively enters the day (e.g., “tomorrow”). An exception will be thrown if DaySemanticItem is not specified or if it is not a valid semantic item, e.g., the ID does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2542] MonthSemanticItem

[2543] Required. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the month value spoken by the user. If the value is assumed by the control and the semantic item is empty, the “assumed” expando property of MonthSemanticItem will be set to true. This property is removed when the value is confirmed by the user. The “spokenText” expando property will be set to the text spoken by the user which effectively enters the month (e.g., “tomorrow”). An exception will be thrown if MonthSemanticItem is not specified or if it is not a valid semantic item, e.g., the ID does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2544] YearSemanticItem

[2545] Optional. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the year value spoken by the user. If the value is assumed by the control and the semantic item is empty, the “assumed” expando property of YearSemanticItem will be set to true. This property is removed when the value is confirmed by the user. The “spokenText” expando property will be set to the text spoken by the user which effectively enters the year (e.g., “tomorrow”). If specified, an exception will be thrown if YearSemanticItem is not a valid semantic item, e.g., the ID does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2546] If YearSemanticItem is not specified, the control will not ask for the year and no QA controls related to the year will be output.

[2547] DateContext

[2548] Optional. Used in both multimodal and voice-only modes. Default: Neutral. By specifying a DateContext, authors can help the control disambiguate users' answers. For example, ‘Christmas’ will either refer to last or next Christmas depending on the value specified in this property.

[2549] The DateContext property is a DateContextEnumeration datatype and may be set to one of the following values: “Past”, “Future”, or “Neutral”. Neutral means no preference.
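The effect of DateContext on an ambiguous answer such as ‘Christmas’ can be illustrated with the sketch below. The function name resolveChristmasYear is hypothetical. The text states only that Neutral means no preference; resolving Neutral to the nearest occurrence is an assumption made here so that the example returns a value.

```javascript
// Returns the year of the Christmas (December 25) that "Christmas" refers to.
// today: a Date used as the reference day; dateContext: "Past", "Future" or "Neutral".
function resolveChristmasYear(today, dateContext) {
  // Drop the time of day so comparisons are made on whole days.
  const day = new Date(today.getFullYear(), today.getMonth(), today.getDate());
  const year = day.getFullYear();
  const thisYear = new Date(year, 11, 25); // month index 11 is December
  const previous = day >= thisYear ? year : year - 1;
  const next = day <= thisYear ? year : year + 1;
  if (dateContext === "Past") return previous;
  if (dateContext === "Future") return next;
  // Neutral: pick the nearest occurrence (assumption, not stated in the text).
  const toPrevious = day - new Date(previous, 11, 25);
  const toNext = new Date(next, 11, 25) - day;
  return toPrevious <= toNext ? previous : next;
}
```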

[2550] AllowRelativeDates

[2551] Optional. Used in both multimodal and voice-only modes. Default: false. If AllowRelativeDates is set to true, relative dates like “today” and “next Tuesday” are allowed.

[2552] AllowHolidays

[2553] Optional. Used in both multimodal and voice-only modes. Default: false. If AllowHolidays is set to true, holiday names such as Christmas are recognized.

[2554] AllowNumeralDates

[2555] Optional. Used in both multimodal and voice-only modes. Default: false. If AllowNumeralDates is set to true, the control accepts numeral formats like “eleven five sixty two” as 11/5/1962.

[2556] PromptSelectFunction

[2557] Optional. Only used in voice-only mode. The QA parameter passed to this function may be one of: “questionDate”, “confirmDate”, “questionDay”, “confirmDay”, “questionMonth”, “confirmMonth”, “questionYear”, “confirmYear”, “validate”.

[2558] StartElementDay

[2559] Optional. Only used in multimodal mode. Default: “”. The ID of the GUI control whose event starts recognition of the day.

[2560] StartEventDay

[2561] Optional. Only used in multimodal mode. Default: “”. Name of the event to start recognition for the day.

[2562] StartElementMonth

[2563] Optional. Only used in multimodal mode. Default: “”. The ID of the GUI control whose event starts recognition of the month.

[2564] StartEventMonth

[2565] Optional. Only used in multimodal mode. Default: “”. Name of the event to start recognition for the month.

[2566] StartElementYear

[2567] Optional. Only used in multimodal mode. Default: “”. The ID of the GUI control whose event starts recognition of the year.

[2568] StartEventYear

[2569] Optional. Only used in multimodal mode. Default: “”. Name of the event to start recognition for the year.

[2570] StopElementDay

[2571] Optional. Only used in multimodal mode. Default: “”. The ID of the GUI control whose event stops recognition of the day.

[2572] StopEventDay

[2573] Optional. Only used in multimodal mode. Default: “”. Name of the event to stop recognition for the day.

[2574] StopElementMonth

[2575] Optional. Only used in multimodal mode. Default: “”. The ID of the GUI control whose event stops recognition of the month.

[2576] StopEventMonth

[2577] Optional. Only used in multimodal mode. Default: “”. Name of the event to stop recognition for the month.

[2578] StopElementYear

[2579] Optional. Only used in multimodal mode. Default: “”. The ID of the GUI control whose event stops recognition of the year.

[2580] StopEventYear

[2581] Optional. Only used in multimodal mode. Default: “”. Name of the event to stop recognition for the year.

[2582] FallbackCount

[2583] Optional. Only used in voice-only mode. Default: 3. Must be greater than or equal to 0. Number of misrecognitions or silences when gathering a full date before the control switches to gathering individual date items. If FallbackCount=0, the control switches immediately. An exception will be thrown for negative values of FallbackCount.

11.2 Client-Side Object

[2584] The client-side object is reserved for future use and is not documented at this time.

[2585] 11.3 Mark-Up

<speech:Date id=“...” SpeechIndex=“...”
    AllowCommands=“...” BabbleTimeout=“...” BargeIn=“...”
    CarrierGrammarUrl=“...” ClientActivationFunction=“...” EndSilence=“...”
    InitialTimeout=“...” MaxTimeout=“...” OnClientActiveFirst=“...”
    OnClientCompleteLast=“...” PostAnswerCarrierRule=“...”
    PreAnswerCarrierRule=“...” PromptSelectFunction=“...”
    QuestionPrompt=“...” PromptDatabase=“...” AutoPostback=“...”
    ConfirmThreshold=“...” ConfirmRejectThreshold=“...” CompleteLast=“...”
    Mode=“...” OnClientActive=“...” OnClientComplete=“...”
    OnClientListening=“...” PostConfirmCarrierRule=“...”
    PreConfirmCarrierRule=“...” RejectThreshold=“...” StartElement=“...”
    StartEvent=“...” StopElement=“...” StopEvent=“...”
    StartElementDay=“...” StopElementDay=“...”
    StartEventDay=“...” StopEventDay=“...”
    StartElementMonth=“...” StopElementMonth=“...”
    StartEventMonth=“...” StopEventMonth=“...”
    StartElementYear=“...” StopElementYear=“...”
    StartEventYear=“...” StopEventYear=“...”
    DaySemanticItem=“...” MonthSemanticItem=“...” YearSemanticItem=“...”
    AllowRelativeDates=“...” AllowHolidays=“...” AllowNumeralDates=“...”
    FallbackCount=“...” runat=“server”/>

[2586] 11.4 Operations

[2587] 11.4.1 Execution Flow

[2588] In voice-only mode, control execution follows this flow:

SpeechIndex | QA
1 | DateConfirm
2 | DateQuestion
3 | MonthConfirm
4 | MonthQuestion
5 | DayConfirm
6 | DayQuestion
7 | YearConfirm
8 | YearQuestion
9 | Validation
10 | Done

[2589] The control will turn off the mainQA and mainConfirmQA and fall back to individual QA controls to collect and confirm the day, month and year information separately when the number of corrections or the count of norecos of either of the two QA controls exceeds FallbackCount.
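The fallback decision can be sketched as a simple counter. This is an illustration under stated assumptions, not the control's implementation: the name createFallbackTracker is hypothetical, and the comparison used (failures greater than or equal to FallbackCount) is chosen so that FallbackCount=0 switches immediately, as the property description states; a strict “exceeds” comparison would behave differently at the boundary.

```javascript
// Tracks failed attempts at gathering the full date.
function createFallbackTracker(fallbackCount) {
  if (fallbackCount < 0) {
    // Stands in for the exception thrown for negative FallbackCount values.
    throw new RangeError("FallbackCount must be greater than or equal to 0");
  }
  let failures = 0;
  return {
    // Call once per misrecognition, silence or correction.
    recordFailure() {
      failures += 1;
    },
    // True once the control should gather day, month and year separately.
    useIndividualQAs() {
      return failures >= fallbackCount;
    }
  };
}
```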

[2590] Relative dates are always confirmed so that the user can be surethat they have been properly resolved.

[2591] In multimodal mode, the start event starts recognition for the whole date and binds the result. Individual start events can be specified to start recognition for a specific part of the date (day, month and year).

[2592] Invalid dates such as Feb. 29, 2001 or April 31 will be rejected as noreco. When an invalid date has been collected item by item, an invalid prompt will be played and all semantic items will be reset (value property will be set to “” and status property will be set to “EMPTY”).
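A minimal sketch of the validity check and the reset described above follows. The function names are hypothetical, semantic items are modeled as plain objects with value and status properties, and treating a date without a year as valid when it exists in a leap year is an assumption.

```javascript
// Returns true when day/month(/year) form a real calendar date.
// month is 1-based (1 = January).
function isValidDate(day, month, year) {
  if (month < 1 || month > 12 || day < 1) return false;
  // Without a year, assume a leap year so that Feb. 29 remains possible (assumption).
  const y = year === undefined ? 2000 : year;
  // Day 0 of the following month is the last day of the requested month.
  const daysInMonth = new Date(y, month, 0).getDate();
  return day <= daysInMonth;
}

// Resets every semantic item as described for an invalid date collected item by item.
function resetSemanticItems(items) {
  for (const item of items) {
    item.value = "";
    item.status = "EMPTY";
  }
}
```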

[2593] 11.4.2 Default Prompts

[2594] The default prompts are:

[2595] DateQuestion QA

[2596] Question: Must be specified by user or an error will be returned.

[2597] QuestionHelp: “Please tell me a date such as May eleventh this year”+Question

[2598] DateConfirm QA

[2599] “Did you say”+normalized(DaySemanticItem.value, MonthSemanticItem.value, YearSemanticItem.value)

[2600] For example: User says “tomorrow”

[2601] Confirm prompt: “Did you say 5 Apr. 2002?”

[2602] ConfirmHelp: “Please say yes or no, or tell me the correct date.”

[2603] MonthQuestion QA

[2604] Question: “Tell me the month.”

[2605] MonthHelp: “Please tell me the month. For example, May.”

[2606] MonthConfirm QA

[2607] “Did you say”+normalized(MonthSemanticItem.value)+?

[2608] For example: User says “5”

[2609] Confirm prompt: “Did you say May?”

[2610] DayQuestion QA

[2611] DayQuestion: “Tell me the day of the month.”

[2612] DayHelp: “Please tell me the day of the month, for example, the eleventh.”

[2613] DayConfirm QA

[2614] “Did you say”+normalized(DaySemanticItem.value)+?

[2615] For example: User says “tomorrow”

[2616] Confirm prompt: “Did you say the 5th?”

[2617] YearQuestion QA

[2618] YearQuestion: “Tell me the year”

[2619] YearHelp: “Please tell me the year”

[2620] YearConfirm QA

[2621] “Did you say”+normalized(YearSemanticItem.value)

[2622] For example: User says “2003”

[2623] Confirm prompt: “Did you say two thousand three?”

[2624] Validation Prompt

[2625] normalized(DaySemanticItem.value, MonthSemanticItem.value, YearSemanticItem.value)+“ is not a valid date”

[2626] All QA Controls

[2627] Silence: “Sorry. I didn't hear you.”

[2628] NoReco: “Sorry. I didn't understand you.”
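The date confirmation and validation prompts listed above can be illustrated with the sketch below. The normalization rule is not specified in the text; the form used here (day, abbreviated month, year) is inferred from the single example “Did you say 5 Apr. 2002?”, and the abbreviation table and function names are assumptions.

```javascript
// Assumed month abbreviations; only "Apr." appears in the text.
const MONTH_ABBREVIATIONS = ["Jan.", "Feb.", "Mar.", "Apr.", "May", "Jun.",
  "Jul.", "Aug.", "Sep.", "Oct.", "Nov.", "Dec."];

// Hypothetical stand-in for normalized(day, month, year); month is 1-based.
function normalizedDate(day, month, year) {
  const parts = [String(day), MONTH_ABBREVIATIONS[month - 1]];
  if (year !== undefined) parts.push(String(year)); // year is optional
  return parts.join(" ");
}

// DateConfirm QA prompt.
function dateConfirmPrompt(day, month, year) {
  return "Did you say " + normalizedDate(day, month, year) + "?";
}

// Validation prompt played for an invalid date.
function validationPrompt(day, month, year) {
  return normalizedDate(day, month, year) + " is not a valid date";
}
```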

11.4.3 EXAMPLES

[2629] control: “Tell me the date.”

[2630] User: “July first this year”

[2631] control: “Did you say July the first this year?”

[2632] User: “yes”

[2633] control: “Tell me the date.”

[2634] User: “July first”

[2635] control: “Did you say July the first this year?”

[2636] User: “yes”

[2637] control: “Tell me the date.”

[2638] User: “the first”

[2639] control: “February the first this year?”

[2640] User: “yes”

[2641] 12 YesNo Control

[2642] The YesNo control retrieves a Yes or No answer. The YesNo control also inherits from the IDTMF interface.

class YesNo : ApplicationControl
{
    string SemanticItem{get; set;};
}

[2643] 12.1 YesNo Properties

[2644] Common properties are described above.

[2645] SemanticItem

[2646] Required. Used in both multimodal and voice-only modes. The ID of the semantic item receiving the value. An exception will be thrown if SemanticItem is not specified or if it is not a valid semantic item, e.g., the ID does not correspond to an element on the page or it corresponds to an element that is not a semantic item.

[2647] 12.2 Client-Side Object

[2648] The client-side object is reserved for future use and is not documented at this time.

[2649] 12.3 Mark-Up

<speech:YesNo id=“...” SpeechIndex=“...”
    AllowCommands=“...” BabbleTimeout=“...” BargeIn=“...”
    CarrierGrammarUrl=“...” ClientActivationFunction=“...” EndSilence=“...”
    InitialTimeout=“...” MaxTimeout=“...” OnClientActiveFirst=“...”
    OnClientCompleteLast=“...” PostAnswerCarrierRule=“...”
    PreAnswerCarrierRule=“...” PromptSelectFunction=“...”
    QuestionPrompt=“...” PromptDatabase=“...” AutoPostback=“...”
    ConfirmThreshold=“...” ConfirmRejectThreshold=“...” CompleteLast=“...”
    Mode=“...” OnClientActive=“...” OnClientComplete=“...”
    OnClientListening=“...” PostConfirmCarrierRule=“...”
    PreConfirmCarrierRule=“...” RejectThreshold=“...” StartElement=“...”
    StartEvent=“...” StopElement=“...” StopEvent=“...” AllowDTMF=“...”
    InterDigitTimeout=“...” OnClientKeyPress=“...” PreFlush=“...”
    SemanticItem=“...” runat=“server”/>

[2650] 12.4 Operation

[2651] Allows speech-enabled page authors to get a yes-no answer from users. The answer can be used to fill in a text box or take author-specified action on yes or no. The control asks the question/confirmation repeatedly until an answer is obtained with confidence above the AcceptThreshold. If DTMF input is enabled, “1” means yes and “2” means no.
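The behavior described above can be sketched as follows. This is illustrative only: the function names are hypothetical, and successive recognition attempts are simulated by an array of results with text and confidence fields, which is an assumption about shape rather than the recognizer's actual result format.

```javascript
// Maps a DTMF key to an answer: "1" means yes, "2" means no.
function dtmfToAnswer(key) {
  if (key === "1") return "yes";
  if (key === "2") return "no";
  return null; // any other key is not an answer
}

// Walks through successive attempts and returns the first answer whose
// confidence is above the threshold, or null if none qualifies.
function firstConfidentAnswer(attempts, acceptThreshold) {
  for (const attempt of attempts) {
    if (attempt.confidence > acceptThreshold) {
      return attempt.text;
    }
  }
  return null;
}
```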

[2652] 12.4.1 Execution Flow

[2653] In voice-only mode, control execution follows this flow:

SpeechIndex | QA
1 | Confirm
2 | Question
3 | Done

[2654] 12.4.2 Default Prompts

[2655] The default prompts are:

Question QA

[2656] Question: Must be specified by user or an error will be returned.

[2657] Question Help: “Please tell me yes or no.”

[2658] Confirm QA

[2659] Confirmation: “Did you say:”

[2660] Confirmation help: “Say yes or no.” (the confirmation prompt is not replayed after the help prompt)

[2661] Done QA

[2662] Prompt: “ ”

[2663] All QA Controls

[2664] Silence: “I didn't hear you”

[2665] NoReco: “I didn't understand you”

[2666] 13 Exceptions

[2667] The following table lists the exceptions thrown by the controls at render time.

Control/Object | Attribute/Method | Condition | Exception
BasicApplicationControl class | EndSilence | EndSilence < 0 | ArgumentOutOfRangeException
BasicApplicationControl class | BabbleTimeout | BabbleTimeout < 0 | ArgumentOutOfRangeException
BasicApplicationControl class | PreAnswerCarrierRule | PreAnswerCarrierRule is specified and CarrierGrammarUrl is not specified | InvalidOperationException
BasicApplicationControl class | PostAnswerCarrierRule | PostAnswerCarrierRule is specified and CarrierGrammarUrl is not specified | InvalidOperationException
BasicApplicationControl class | PreConfirmCarrierRule | PreConfirmCarrierRule is specified and CarrierGrammarUrl is not specified | InvalidOperationException
BasicApplicationControl class | PostConfirmCarrierRule | PostConfirmCarrierRule is specified and CarrierGrammarUrl is not specified | InvalidOperationException
BasicApplicationControl class | InitialTimeout | InitialTimeout < 0 | ArgumentOutOfRangeException
BasicApplicationControl class | MaxTimeout | MaxTimeout < 0 | ArgumentOutOfRangeException
ApplicationControl class | AutoPostback | AutoPostback is true and CompleteLast not specified | InvalidOperationException
ApplicationControl class | ConfirmThreshold | ConfirmThreshold < 0 or > 1 | ArgumentOutOfRangeException
ApplicationControl class | ConfirmRejectThreshold | ConfirmRejectThreshold < 0 or > 1 | ArgumentOutOfRangeException
ApplicationControl class | FirstInitialTimeout | FirstInitialTimeout < 0 | ArgumentOutOfRangeException
ApplicationControl class | RejectThreshold | RejectThreshold < 0 or > 1 | ArgumentOutOfRangeException
ApplicationControl class | StartEvent | StartEvent is specified and StartElement is not | InvalidOperationException
ApplicationControl class | StopEvent | StopEvent is specified and StopElement is not | InvalidOperationException
SingleItemChooser | DataSource | DataSource not specified | ArgumentNullException
SingleItemChooser | DataTextField | Missing from database | ArgumentException
SingleItemChooser | DataBindField | Missing from database | ArgumentException
SingleItemChooser | DataTextField | Duplicates in database | ArgumentException
SingleItemChooser | SemanticItem | SemanticItem not specified | ArgumentNullException
SingleItemChooser | SemanticItem | SemanticItem is not a valid semantic item | ArgumentException
Navigator | InitialShortTimeout | InitialShortTimeout < 0 | ArgumentOutOfRangeException
Navigator | DataContentFields | DataContentFields not specified | ArgumentNullException
Navigator | DataHeaderFields | DataHeaderFields not specified | ArgumentNullException
Navigator | DataSource | DataSource not specified | ArgumentNullException
AlphaDigit | SemanticItem | SemanticItem not specified | ArgumentNullException
AlphaDigit | SemanticItem | SemanticItem is not a valid semantic item | ArgumentException
AlphaDigit | InputMask | InputMask not specified | ArgumentNullException
AlphaDigit | InputMask | InputMask is not a valid format | ArgumentException
NaturalNumber | LowerBound | LowerBound < 0 or LowerBound > UpperBound | ArgumentOutOfRangeException
NaturalNumber | UpperBound | UpperBound > 999,999 | ArgumentOutOfRangeException
Currency | SemanticItem | SemanticItem not specified | ArgumentNullException
Currency | SemanticItem | SemanticItem is not a valid semantic item | ArgumentException
Phone | AreaCodeSemanticItem | AreaCodeSemanticItem not specified | ArgumentNullException
Phone | AreaCodeSemanticItem | AreaCodeSemanticItem is not a valid semantic item | ArgumentException
Phone | LocalNumberSemanticItem | LocalNumberSemanticItem not specified | ArgumentNullException
Phone | LocalNumberSemanticItem | LocalNumberSemanticItem is not a valid semantic item | ArgumentException
Phone | ExtensionSemanticItem | ExtensionSemanticItem is specified and is not a valid semantic item | ArgumentException
Zipcode | ZipcodeSemanticItem | ZipcodeSemanticItem not specified | ArgumentNullException
Zipcode | ZipcodeSemanticItem | ZipcodeSemanticItem is not a valid semantic item | ArgumentException
Zipcode | ExtensionSemanticItem | ExtensionSemanticItem is specified and is not a valid semantic item | ArgumentException
SocialSecurityNumber | SemanticItem | SemanticItem not specified | ArgumentNullException
SocialSecurityNumber | SemanticItem | SemanticItem is not a valid semantic item | ArgumentException
Date | DaySemanticItem | DaySemanticItem not specified | ArgumentNullException
Date | DaySemanticItem | DaySemanticItem is not a valid semantic item | ArgumentException
Date | MonthSemanticItem | MonthSemanticItem not specified | ArgumentNullException
Date | MonthSemanticItem | MonthSemanticItem is not a valid semantic item | ArgumentException
Date | YearSemanticItem | YearSemanticItem is specified and is not a valid semantic item | ArgumentException
Date | FallbackCount | FallbackCount < 0 | ArgumentOutOfRangeException
YesNo | SemanticItem | SemanticItem not specified | ArgumentNullException
YesNo | SemanticItem | SemanticItem is not a valid semantic item | ArgumentException
CreditCard | CreditCardsAllowed | CreditCardsAllowed is null | ArgumentException
CreditCard | NumberSemanticItem | NumberSemanticItem not specified | ArgumentNullException
CreditCard | NumberSemanticItem | NumberSemanticItem is not a valid semantic item | ArgumentException
CreditCard | ExpirationMonthSemanticItem | ExpirationMonthSemanticItem not specified | ArgumentNullException
CreditCard | ExpirationMonthSemanticItem | ExpirationMonthSemanticItem is not a valid semantic item | ArgumentException
CreditCard | ExpirationYearSemanticItem | ExpirationYearSemanticItem not specified | ArgumentNullException
CreditCard | ExpirationYearSemanticItem | ExpirationYearSemanticItem is not a valid semantic item | ArgumentException
CreditCard | AllowVisa/AllowAmex/AllowDiscover/AllowmasterCard/AllowDinersClub | No credit card types are allowed, i.e., at least one of the properties is not true | InvalidOperationException
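A few of the render-time checks in the table above can be sketched as a single validation routine. This is an illustration, not the controls' implementation: the function name validateApplicationControl and the plain property object are hypothetical, RangeError stands in for ArgumentOutOfRangeException, and a generic Error stands in for InvalidOperationException.

```javascript
// Checks a subset of the ApplicationControl class conditions.
// props: plain object keyed by attribute name; unset attributes are undefined.
function validateApplicationControl(props) {
  // Thresholds must lie in the closed range [0, 1].
  for (const name of ["ConfirmThreshold", "ConfirmRejectThreshold", "RejectThreshold"]) {
    const value = props[name];
    if (value !== undefined && (value < 0 || value > 1)) {
      throw new RangeError(name + " must be between 0 and 1");
    }
  }
  if (props.FirstInitialTimeout !== undefined && props.FirstInitialTimeout < 0) {
    throw new RangeError("FirstInitialTimeout must not be negative");
  }
  // An event without its element is an invalid combination.
  if (props.StartEvent && !props.StartElement) {
    throw new Error("StartEvent is specified and StartElement is not");
  }
  if (props.StopEvent && !props.StopElement) {
    throw new Error("StopEvent is specified and StopElement is not");
  }
}
```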

[2668] 14 Issues

Control/property | Issue
BasicApplicationControl/BargeinType | There is no way other than using a global stylesheet to set bargeintype in app controls.
All controls | Re-entry: 1. For controls that take at least one target element, the control can be re-entered by clearing up the semantic items associated with one or more of the target elements. Semantic items that are not cleared up will be considered confirmed. This solution assumes that all internal semantic items (not directly accessible to the authors) have been cleared when the control stopped. It also assumes that authors will know about the functions to reset semantic items (these functions are not currently documented). 2. Controls provide a reset function that can be used to reset all semantic items and re-enter the control. (SDK review team decided not to provide a re-entry story until a decision is taken on how to reset QA controls. The Application Controls will clean up their internal state before exiting.) The scheme used to get and confirm a series of digits is fairly ambitious. Although there do not seem to be blocking issues, we consider that a fallback plan should be considered. In case we need to cut some features or reduce development time, the digit collection will be done as follows: each chunk will be asked and explicitly confirmed one at a time. Extra-answers will allow users to provide more digits when answering the question. (agreed by dev team)
AlphaDigit/InputMask | Make sure there are no IP issues with the mask notation (based on Spwx stuff). We could change the mask to typical regex notation.
Currency | What is decimal point in DTMF?
YesNo | If DTMF input is enabled, “1” means yes and “2” means no. Is this correct/ok?
CreditCard | For expiration date in DTMF, what is allowed, i.e., do we allow “0”?
Date | Make sure there are no trademark issues with the names of credit cards.
BasicApplicationControl, Phone, ZipCode, Date, CreditCard | Need to be consistent with exceptions on setting of start/stop element/event. Should we throw exceptions when an element is specified and an event is not (and vice versa)?

Appendix

[2669] 15 DET Descriptions

[2670] The following table lists brief descriptions for each control, object and property. These descriptions will be used by the DET tool and be exposed to the dialog author using Visual Studio.

Control/object | Attribute/Method/Object | Brief description
BasicApplicationControl class | AllowCommands | Whether or not commands may be activated in the control
BasicApplicationControl class | BabbleTimeout | The period of time in milliseconds in which the recognizer must return a result after detection of speech
BasicApplicationControl class | Bargein | Whether or not the playback of the prompt may be interrupted by the human listener
BasicApplicationControl class | CarrierGrammarURL | URL of the grammar containing carrier phrases
BasicApplicationControl class | ClientActivationFunction | Client-side function used to determine whether or not to activate the QA control.
BasicApplicationControl class | EndSilence | Period of silence after the end of an utterance which must be free of speech after which recognition results are returned
BasicApplicationControl class | InitialTimeout | The time in milliseconds between start of recognition and the detection of speech
BasicApplicationControl class | MaxTimeout | The period of time in milliseconds between recognition start and results returned to the browser
BasicApplicationControl class | OnClientActiveFirst | Client-side function called after control is determined to be active
BasicApplicationControl class | OnClientCompleteLast | Client-side function called after execution of control (successfully or not)
BasicApplicationControl class | PostAnswerCarrierRule | Name of the rule for the carrier phrase following an answer
BasicApplicationControl class | PreAnswerCarrierRule | Name of the rule for the carrier phrase preceding an answer
BasicApplicationControl class | PromptDatabase | Name of the prompt database
BasicApplicationControl class | PromptSelectFunction | Function that selects and/or modifies a prompt string prior to playback
BasicApplicationControl class | QuestionPrompt | Prompt of the main question
BasicApplicationControl class | SpeechIndex | Specifies control activation order
ApplicationControl class | AllowDtmf | Whether or not DTMF input is allowed.
ApplicationControl class | AutoPostback | Whether or not to post back to the server each time user interacts with the control
ApplicationControl class | CompleteLast | Server-side function called when the CompleteLast event fires
ApplicationControl class | ConfirmThreshold | The minimum confidence level of recognition necessary to mark an item as confirmed
ApplicationControl class | ConfirmRejectThreshold | Rejection threshold for the confirmation phase in this control
ApplicationControl class | FirstInitialTimeout | Initial timeout when QA.Count == 1
ApplicationControl class | Mode | Recognition mode to be followed
ApplicationControl class | OnClientActive | Client-side function called after each internal QA is determined to be active
ApplicationControl class | OnClientComplete | Client-side function called after execution of each internal QA (successfully or not)
ApplicationControl class | OnClientListening | Client-side function called after successful start of the reco object
ApplicationControl class | PostConfirmCarrierRule | Name of the rule for the carrier phrase following a confirm
ApplicationControl class | PreConfirmCarrierRule | Name of the rule for the carrier phrase preceding a confirm
ApplicationControl class | RejectThreshold | Rejection threshold for this control
ApplicationControl class | StartElement | ID of the GUI control whose event will activate recognition
ApplicationControl class | StartEvent | Name of the GUI event that will activate recognition
ApplicationControl class | StopElement | ID of the GUI control whose event will deactivate recognition
ApplicationControl class | StopEvent | Name of the GUI event that will deactivate recognition
SingleItemChooser | DataBindField | Name of the data field used for the text content of the list items
SingleItemChooser | DataMember | The table used for binding when a DataSet is used as a data source
SingleItemChooser | DataSource | The data source used to populate the control with items
SingleItemChooser | DataTextField | Name of the data field used for the text content of the list items
SingleItemChooser | SemanticItem | ID of the semantic item receiving the value spoken by the user
Navigator | Columns | Collection of ColumnTemplate objects
Navigator | ContentTemplate | Template that defines how contents are played
Navigator | DataContentFields | Names of the data fields used to create the contents
Navigator | DataHeaderFields | Names of the data fields used to create the headers
Navigator | DataMember | The table used for binding when a DataSet is used as a data source
Navigator | DataSource | The data source used to populate the control with items
Navigator | DisableColumnNavigation | Whether or not navigating to column content is allowed
Navigator | HeaderTemplate | Template that defines how headers are played
Navigator | InitialShortTimeout | Time period before Silence event is fired
Navigator | SemanticItem | ID of the semantic item receiving the value spoken by the user
Currency | PreferDollars | Whether or not whole amounts are preferred when input is ambiguous
AlphaDigit | Grouping | Enables/disables digit grouping input
AlphaDigit | InputMask | Defines constraints to character or range input
AlphaDigit | SemanticItem | ID of the semantic item receiving the value spoken by the user
Numeral | SemanticItem | ID of the semantic item receiving the value spoken by the user
Numeral | LowerBound | Smallest number accepted by the control
Numeral | UpperBound | Largest number accepted by the control
Numeral | ValidationEvent | When to validate that the number is within range
Phone | AreaCodeSemanticItem | ID of the semantic item receiving the area code spoken by the user
Phone | LocalNumberSemanticItem | ID of the semantic item receiving the local number spoken by the user
Phone | ExtensionSemanticItem | ID of the semantic item receiving the extension spoken by the user
Phone | StartElementAreaCode | ID of the GUI control whose event starts recognition of the area code
Phone | StopElementAreaCode | ID of the GUI control whose event stops recognition of the area code
Phone | StartElementLocalNumber | ID of the GUI control whose event starts recognition of the local number
Phone | StopElementLocalNumber | ID of the GUI control whose event stops recognition of the local number
Phone | StartElementExtension | ID of the GUI control whose event starts recognition of the extension
Phone | StopElementExtension | ID of the GUI control whose event stops recognition of the extension
Phone | StartEventAreaCode | Name of the event that starts recognition of the area code part
Phone | StopEventAreaCode | Name of the event that stops recognition of the area code part
Phone | StartEventLocalNumber | Name of the event that starts recognition of the local number part
Phone | StopEventLocalNumber | Name of the event that stops recognition of the local number part
Phone | StartEventExtension | Name of the event that starts recognition of the extension part
Phone | StopEventExtension | Name of the event that stops recognition of the extension part
Phone | RequiresAreaCode | Determines whether or not the control asks for area code
ZipCode | ZipcodeSemanticItem | ID of the semantic item receiving the zipcode spoken by the user
ZipCode | ExtensionSemanticItem | ID of the semantic item receiving the extension spoken by the user
ZipCode | StartElementZipcode | ID of the GUI control whose event starts recognition of the zipcode
ZipCode | StopElementZipcode | ID of the GUI control whose event stops recognition of the zipcode
ZipCode | StartEventZipcode | Name of the event that starts recognition of the zipcode
ZipCode | StopEventZipcode | Name of the event that stops recognition of the zipcode
ZipCode | StartElementExtension | ID of the GUI control whose event starts recognition of the extension
ZipCode | StopElementExtension | ID of the GUI control whose event stops recognition of the extension
ZipCode | StartEventExtension | Name of the event that starts recognition of the extension
ZipCode | StopEventExtension | Name of the event that stops recognition of the extension
SocialSecurityNumber | SemanticItem | ID of the semantic item receiving the number spoken by the user
SocialSecurityNumber | Separator | Character that separates fields of the number
Date | DaySemanticItem | ID of the semantic item receiving the day value spoken by the user
Date | MonthSemanticItem | ID of the semantic item receiving the month value spoken by the user
Date | YearSemanticItem | ID of the semantic item receiving the year value spoken by the user
Date | DateContext | Sets the date preference of the control
Date | AllowRelativeDates | Whether or not the control accepts dates like “today”
Date | AllowHolidays | Whether or not the control accepts dates like “Christmas”
Date | AllowNumeralDates | Whether or not the control accepts numeral formats like “eleven five sixty two”
Date | StartElementDay | ID of the GUI control whose event starts recognition of the day
Date | StartEventDay | Name of the event that starts recognition of the day
Date | StartElementMonth | ID of the GUI control whose event starts recognition of the month
Date | StartEventMonth | Name of the event that starts recognition of the month
Date | StartElementYear | ID of the GUI control whose event starts recognition of the year
Date | StartEventYear | Name of the event that starts recognition of the year
Date | StopElementDay | ID of the GUI control whose event stops recognition of the day
Date | StopEventDay | Name of the event that stops recognition of the day
Date | StopElementMonth | ID of the GUI control whose event stops recognition of the month
Date | StopEventMonth | Name of the event that stops recognition of the month
Date | StopElementYear | ID of the GUI control whose event stops recognition of the year
Date | StopEventYear | Name of the event that stops recognition of the year
Date | FallbackCount | Maximum number of attempts at

What is claimed is:
 1. A computer readable medium having instructions, which when executed on a computer generate client side markup for a client in a client/server system, the instructions comprising: a first set of visual controls having attributes for visual rendering on the client device; a second set of controls having attributes related to at least one of recognition and audibly prompting; and an application control for performing a selected task, the application control having properties for outputting controls of the second set to perform the selected task and associating the outputted controls with the first set of controls.
 2. The computer readable medium of claim 1 wherein the selected task includes obtaining information.
 3. The computer readable medium of claim 2 wherein the second set of controls includes means for defining a prompt generating markup for providing a question.
 4. The computer readable medium of claim 3 wherein the second set of the controls provides means for defining a confirmation for generating markup related to confirming that a recognized result is correct.
 5. The computer readable medium of claim 3 wherein the second set of controls includes means for defining a comparison to generate markup for comparing a recognized result with a selected value.
 6. The computer readable medium of claim 1 wherein the second set of controls includes means for maintaining a recognized result apart from the associated control of the first set of controls, said means for maintaining associating the recognized result with the control of the first set of controls.
 7. The computer readable medium of claim 6 wherein the means for maintaining the recognized result includes means for indicating that the recognized result has changed.
 8. The computer readable medium of claim 6 wherein the means for maintaining the recognized result includes means for indicating that the recognized result has been confirmed.
 9. The computer readable medium of claim 6 and means for maintaining a recognized result includes maintaining a set of items for corresponding recognized results, and wherein at least some of the items are individually associated with controls of the first set of controls, and wherein states are maintained for at least some of the items, the states including if the item is empty and if the item has been confirmed.
 10. The computer readable medium of claim 2 wherein the application control includes a property that defines the format of input.
 11. The computer readable medium of claim 10 wherein the input comprises at least one of alphabetical characters and numerical characters.
 12. The computer readable medium of claim 11 wherein property defines ranges of allowable input.
 13. The computer readable medium of claim 2 wherein the information obtained is a number.
 14. The computer readable medium of claim 2 wherein the information obtained is numerical information in a selected format.
 15. The computer readable medium of claim 2 wherein the information obtained is a calendar date.
 16. The computer readable medium of claim 2 wherein the information obtained is a yes or no answer.
 17. The computer readable medium of claim 2 wherein the task performed is to navigate a table of information.
 18. The computer readable medium of claim 17 wherein the application control implements markup including receiving content commands to render information within the table.
 19. The computer readable medium of claim 17 wherein the application control includes a grammar associated with the table.
 20. The computer readable medium of claim 19 wherein the table includes a plurality of header fields and wherein the grammar is associated with the plurality of header fields.
 21. The computer readable medium of claim 20 wherein the table further includes column headings and wherein the grammar is further associated with the column headings.
 22. The computer readable medium of claim 17 wherein a set of navigation commands include at least one of a next command and a previous command.
 23. The computer readable medium of claim 17 wherein the application implements markup including rendering possible choices to a user.
 24. The computer readable medium of claim 17 wherein the application control implements markup including receiving navigation commands to update a position within the table.
 25. The computer readable medium of claim 24 wherein the position identifies a particular row in the table.
 26. The computer readable medium of claim 24 wherein the position further identifies a particular column in the table.
 27. The computer readable medium of claim 24 wherein the table includes a plurality of rows, each row having at least one header field and at least one content field, and wherein the position identifies one of the plurality of rows.
 28. The computer readable medium of claim 27 wherein the application control implements markup including rendering the header field of said one of the plurality of rows when a position is updated.
 29. The computer readable medium of claim 27 wherein the application control implements markup including rendering the content field of said one of the plurality of rows upon receiving a content command.